J. J. Allaire, J. Horner, V. Marti, and N. Porte, Markdown: 'Markdown' rendering for R, 2015.

E. Anthoine, L. Moret, A. Regnault, V. Sébille, and J. Hardouin, Sample size used to validate a scale: A review of publications on newly-developed patient reported outcomes measures, Health and Quality of Life Outcomes, vol.12, 2014.

L. A. van der Ark, Mokken scale analysis in R, Journal of Statistical Software, vol.20, issue.11, pp.1-19, 2007.

F. B. Baker and S. Kim, The basics of item response theory using R, 2017.

D. J. Bartholomew, Scaling unobservable constructs in social science, Journal of the Royal Statistical Society: Series C (Applied Statistics), vol.47, issue.1, pp.1-13, 1998.

T. Bond and C. M. Fox, Applying the Rasch model: Fundamental measurement in the human sciences, 2015.

D. Borsboom, Measuring the mind: Conceptual issues in contemporary psychometrics, 2005.

D. Borsboom, Latent variable theory, Measurement: Interdisciplinary Research and Perspectives, vol.6, pp.25-53, 2008.

D. Borsboom, A network theory of mental disorders, World Psychiatry: Official Journal of the World Psychiatric Association (WPA), vol.16, issue.1, pp.5-13, 2017.

D. Borsboom, M. Rhemtulla, A. O. Cramer, H. L. van der Maas, M. Scheffer et al., Kinds versus continua: A review of psychometric approaches to uncover the structure of psychiatric constructs, Psychological Medicine, vol.46, issue.8, pp.1567-1579, 2016.

E. Broadbent, K. J. Petrie, J. Main, and J. Weinman, The brief illness perception questionnaire, Journal of Psychosomatic Research, vol.60, issue.6, pp.631-637, 2006.

J. C. Cappelleri, J. J. Lundy, and R. D. Hays, Overview of classical test theory and item response theory for the quantitative assessment of items in developing patient-reported outcomes measures, Clinical Therapeutics, vol.36, issue.5, pp.648-662, 2014.

D. Cella, R. Gershon, J. Lai, and S. Choi, The future of outcomes measurement: Item banking, tailored short-forms, and computerized adaptive assessment, Quality of Life Research, vol.16, issue.1, pp.133-141, 2007.

D. Cella, W. Riley, A. Stone, N. Rothrock, B. Reeve et al., The patient-reported outcomes measurement information system (PROMIS) developed and tested its first wave of adult self-reported health outcome item banks: 2005-2008, Journal of Clinical Epidemiology, vol.63, issue.11, pp.1179-1194, 2010.

R. P. Chalmers, Mirt: A multidimensional item response theory package for the R environment, Journal of Statistical Software, 2012.

E. H. Chan, Standards and guidelines for validation practices: Development and evaluation of measurement instruments, Validity and validation in social, behavioral, and health sciences, pp.9-24, 2014.

W. Chen, W. Lenderking, Y. Jin, K. W. Wyrwich, H. Gelhorn et al., Is Rasch model analysis applicable in small sample size pilot studies for assessing item characteristics? An example using PROMIS pain behavior item bank data, Quality of Life Research, vol.23, issue.2, pp.485-493, 2014.

J. Clatworthy, D. Buick, M. Hankins, J. Weinman, and R. Horne, The use and reporting of cluster analysis in health psychology: A review, British Journal of Health Psychology, vol.10, issue.3, pp.329-358, 2005.

J. M. Cortina, What is coefficient alpha? An examination of theory and applications, Journal of Applied Psychology, vol.78, issue.1, pp.98-104, 1993.

G. Costantini, S. Epskamp, D. Borsboom, M. Perugini, R. Mõttus et al., State of the art personality research: A tutorial on network analysis of personality data in R, Journal of Research in Personality, vol.54, pp.13-29, 2015.

R. Crutzen and G. Y. Peters, Scale quality: Alpha is an inadequate estimate and factor-analytic evidence is needed first of all, Health Psychology Review, vol.11, issue.3, pp.242-247, 2017.

T. J. Dunn, T. Baguley, and V. Brunsden, From alpha to omega: A practical solution to the pervasive problem of internal consistency estimation, British Journal of Psychology, vol.105, issue.3, pp.399-412, 2014.

S. Epskamp, D. Borsboom, and E. I. Fried, Estimating psychological networks and their accuracy: A tutorial paper, Behavior Research Methods, pp.1-18, 2017.

B. S. Everitt, S. Landau, M. Leese, and D. Stahl, Cluster analysis, 2011.

J. K. Flake, J. Pek, and E. Hehman, Construct validation in social and personality research: Current practice and recommendations, Social Psychological and Personality Science, 2017.

F. J. Floyd and K. F. Widaman, Factor analysis in the development and refinement of clinical assessment instruments, Psychological Assessment, vol.7, issue.3, pp.286-299, 1995.

C. C. Fok and D. Henry, Increasing the sensitivity of measures to change, Prevention Science: The Official Journal of the Society for Prevention Research, vol.16, issue.7, pp.978-986, 2015.

C. Friedman, J. Rubin, J. Brown, M. Buntin, M. Corn et al., Toward a science of learning systems: A research agenda for the high-functioning learning health system, Journal of the American Medical Informatics Association, vol.22, issue.1, pp.43-50, 2015.

J. F. Fries, E. Krishnan, M. Rose, B. Lingala, and B. Bruce, Improved responsiveness and reduced sample size requirements of PROMIS physical function scales with item response theory, Arthritis Research & Therapy, vol.13, 2011.

M. H. Frost, B. B. Reeve, A. M. Liepa, J. W. Stauffer, and R. D. Hays, What is sufficient evidence for the reliability and validity of patient-reported outcome measures?, Value in Health, vol.10, pp.94-105, 2007.

C. Gandrud, Reproducible research with R and RStudio, 2013.

J. M. Graham, Congeneric and (essentially) tau-equivalent estimates of score reliability: What they are and how to use them, Educational and Psychological Measurement, vol.66, issue.6, pp.930-944, 2006.

K. Hamilton, M. M. Marques, and B. T. Johnson, Advanced analytic and statistical methods in health psychology, Health Psychology Review, vol.11, issue.3, pp.217-221, 2017.

R. D. Hays, L. S. Morales, and S. P. Reise, Item response theory and health outcomes measurement in the 21st century, Medical Care, vol.38, issue.9, pp.28-42, 2000.

B. T. Hemker, K. Sijtsma, and I. W. Molenaar, Selection of unidimensional scales from a multidimensional item bank in the polytomous Mokken IRT model, Applied Psychological Measurement, vol.19, issue.4, pp.337-352, 1995.

J. Hobart and S. Cano, Improving the evaluation of therapeutic interventions in multiple sclerosis: The role of new psychometric methods, Health Technology Assessment, issue.12, pp.1-177, 2009.

T. P. Hogan and J. Agnello, An empirical study of reporting practices concerning measurement validity, Educational and Psychological Measurement, vol.64, issue.5, pp.802-812, 2004.

L. Hu and P. M. Bentler, Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives, Structural Equation Modeling: A Multidisciplinary Journal, vol.6, issue.1, pp.1-55, 1999.

J. A. Hutcheon, A. Chiolero, and J. A. Hanley, Random measurement error and regression dilution bias, BMJ, vol.340, 2010.

D. L. Jackson, J. A. Gillaspy, and R. Purc-Stephenson, Reporting practices in confirmatory factor analysis: An overview and some recommendations, Psychological Methods, vol.14, issue.1, pp.6-23, 2009.

M. P. Jensen, S. E. Strom, J. A. Turner, and J. M. Romano, Validity of the Sickness Impact Profile Roland scale as a measure of dysfunction in chronic pain patients, Pain, vol.50, issue.2, pp.157-162, 1992.

A. Kamata and D. J. Bauer, A note on the relation between factor analytic and item response theory models, Structural Equation Modeling: A Multidisciplinary Journal, vol.15, issue.1, pp.136-153, 2008.

K. Kelley and Y. Cheng, Estimation of and confidence interval formation for reliability coefficients of homogeneous measurement instruments, Methodology, vol.8, issue.2, pp.39-50, 2012.

F. Leisch, Sweave: Dynamic generation of statistical reports using literate data analysis, Compstat 2002: Proceedings in computational statistics, pp.575-580, 2002.

C. Li, Confirmatory factor analysis with ordinal data: Comparing robust maximum likelihood and diagonally weighted least squares, Behavior Research Methods, vol.48, issue.3, pp.936-949, 2016.

J. M. Linacre, Sample size and item calibration or person measure stability, Rasch Measurement Transactions, vol.7, issue.4, 1994.

R. C. MacCallum, K. F. Widaman, S. Zhang, and S. Hong, Sample size in factor analysis, Psychological Methods, vol.4, pp.84-99, 1999.

M. Maechler, P. Rousseeuw, A. Struyf, M. Hubert, and K. Hornik, Cluster: Cluster analysis basics and extensions, 2017.

P. Mair and R. Hatzinger, Extended Rasch modeling: The eRm package for the application of IRT models in R, Journal of Statistical Software, vol.20, issue.9, 2007.

M. Marshall, A. Lockwood, C. Bradley, C. Adams, C. Joy et al., Unpublished rating scales: A major source of bias in randomised controlled trials of treatments for schizophrenia, The British Journal of Psychiatry, vol.176, issue.3, pp.249-252, 2000.

C. A. McHorney and A. R. Tarlov, Individual-patient monitoring in clinical practice: Are available health status surveys adequate?, Quality of Life Research, vol.4, issue.4, pp.293-307, 1995.

R. R. Meijer and J. J. Baneke, Analyzing psychopathology items: A case for nonparametric item response theory modeling, Psychological Methods, vol.9, issue.3, pp.354-368, 2004.

R. R. Meijer, A. S. Niessen, and J. N. Tendeiro, A practical guide to check the consistency of item response patterns in clinical research through person-fit statistics: Examples and a computer program, Assessment, vol.23, issue.1, pp.52-62, 2016.

R. Melzack, The short-form McGill pain questionnaire, Pain, vol.30, issue.2, pp.191-197, 1987.

L. B. Mokkink, C. B. Terwee, D. L. Patrick, J. Alonso, P. W. Stratford et al., The COSMIN checklist for assessing the methodological quality of studies on measurement properties of health status measurement instruments: An international Delphi study, Quality of Life Research, vol.19, issue.4, pp.539-549, 2010.

J. C. Nunnally and I. H. Bernstein, Psychometric theory, 1994.

G. Y. Peters, A. L. Dima, A. M. Plass, R. Crutzen, C. Gibbons et al., Measurement in health psychology: Combining theory, qualitative, and quantitative methods to do it right: Methods in health psychology symposium VI, The European Health Psychologist, vol.18, pp.235-246, 2016.

R. Rabin and F. de Charro, EQ-5D: A measure of health status from the EuroQol group, Annals of Medicine, vol.33, issue.5, pp.337-343, 2001.

R Core Team, R: A language and environment for statistical computing, 2013.

B. B. Reeve, K. W. Wyrwich, A. W. Wu, G. Velikova, C. B. Terwee et al., ISOQOL recommends minimum standards for patient-reported outcome measures used in patient-centered outcomes and comparative effectiveness research, Quality of Life Research, vol.22, issue.8, pp.1889-1905, 2013.

S. P. Reise, A. T. Ainsworth, and M. G. Haviland, Item response theory fundamentals, applications, and promise in psychological research, Current Directions in Psychological Science, vol.14, issue.2, pp.95-101, 2005.

W. Revelle, Psych: Procedures for psychological, psychometric, and personality research, 2017.

W. Revelle and R. E. Zinbarg, Coefficients alpha, beta, omega, and the glb: Comments on Sijtsma, Psychometrika, vol.74, issue.1, p.145, 2009.

D. Rizopoulos, Ltm: An R package for latent variable modeling and item response analysis, Journal of Statistical Software, issue.5, p.17, 2007.

M. Roland and R. Morris, A study of the natural history of back pain. Part I: Development of a reliable and sensitive measure of disability in low-back pain, Spine, vol.8, issue.2, pp.141-144, 1983.

Y. Rosseel, Lavaan: An R package for structural equation modeling, Journal of Statistical Software, vol.48, issue.2, pp.1-36, 2012.

R. Sawatzky, E. K. Chan, B. D. Zumbo, S. Ahmed, S. J. Bartlett et al., Modern perspectives of measurement validation emphasize justification of inferences based on patient-reported outcome scores: Seventh paper in a series on patient reported outcomes, Journal of Clinical Epidemiology, 2016.

T. A. Schmitt, Current methodological considerations in exploratory and confirmatory factor analysis, Journal of Psychoeducational Assessment, vol.29, issue.4, pp.304-321, 2011.

W. H. van Schuur, Mokken scale analysis: Between the Guttman scale and parametric item response theory, Political Analysis, vol.11, issue.2, pp.139-163, 2003.

K. Sijtsma, On the use, the misuse, and the very limited usefulness of Cronbach's alpha, Psychometrika, vol.74, issue.1, p.107, 2009.

K. Sijtsma and B. T. Hemker, Nonparametric polytomous IRT models for invariant item ordering, with results for parametric models, Psychometrika, vol.63, issue.2, pp.183-200, 1998.

K. Sijtsma and I. W. Molenaar, Introduction to nonparametric item response theory, 2002.

J. Singh, Tackling measurement problems with item response theory, Journal of Business Research, vol.57, issue.2, pp.302-304, 2004.

S. M. Skevington, M. Lotfy, and K. A. O'Connell, The World Health Organization's WHOQOL-BREF quality of life assessment: Psychometric properties and results of the international field trial. A report from the WHOQOL group, Quality of Life Research, vol.13, issue.2, pp.299-310, 2004.

J. Stochl, P. B. Jones, and T. J. Croudace, Mokken scale analysis of mental health and wellbeing questionnaire item responses: A non-parametric IRT method in empirical research for applied health researchers, BMC Medical Research Methodology, vol.12, 2012.

J. H. Straat, L. A. van der Ark, and K. Sijtsma, Minimum sample size requirements for Mokken scale analysis, Educational and Psychological Measurement, vol.74, issue.5, pp.809-822, 2014.

M. W. Stroud, P. E. McKnight, and M. P. Jensen, Assessment of self-reported physical activity in patients with chronic pain: Development of an abbreviated Roland-Morris disability scale, The Journal of Pain, vol.5, issue.5, pp.257-263, 2004.

P. Torfs and C. Brauer, A (very) short introduction to R, 2014.

R. Watson, L. A. van der Ark, L. Lin, R. Fieo, I. J. Deary et al., Item response theory: How Mokken scaling can be used in clinical practice, Journal of Clinical Nursing, vol.21, pp.2736-2746, 2012.