
Results of RQ1: What is the history of the ELIXIR evaluation questions?

What are the ELIXIR evaluation questions?

You can find the ELIXIR evaluation questions here.

This paper deals only with the mandatory questions 5 up to and including 9.

What is the ancestry of the NBIS questions?

The paper in which these questions were first described is [Gurwitz et al., 2020]. There we can read that these questions are based on [Jordan et al., 2018] and [Brazas & Ouellette, 2016]. These last two papers do not reference any academic papers on where their questions originated.

What are the goals of these questions?

flowchart TD
  goal_1[ELIXIR: Determine training quality]
  goal_2[NBIS: Find out how participants have used the skills and knowledge they gained]
  goal_3[NBIS: Improve the course and the materials we deliver]
  goal_1 <-.-> |similarish| goal_3
  goal_2 <-.-> |dissimilar| goal_3

Figure 2. Overview of the goals of the evaluation questions

A second indication of how good these evaluation questions are can be obtained by reading the stated goals of the questions.

Here is the ELIXIR goal [Gurwitz et al., 2020]:

We were interested in participant satisfaction as a reflection on training quality in order to be able to inform best practice for ELIXIR training.

Here the intention of the NBIS short-term evaluation form is quoted:

The intention of the STF survey is to find out how participants have used the skills and knowledge they gained through participating in the NBIS course.

To me, this seems like a copy-paste error from the NBIS long-term survey. Section 1 of the form seems more in line with a short-term evaluation:

It is really important to us in order to continually improve the course and the materials we deliver

So it seems the goal of the STF is to improve the course and its materials.

Development of the questions

ELIXIR developed these evaluation questions to, as quoted from [Gurwitz et al., 2020]:

  • 'describe the audience demographic being reached by ELIXIR training events'
  • 'assess the quality of ELIXIR training events directly after they have taken place'

The resulting metrics can be found at https://training-metrics-dev.elixir-europe.org/all-reports.

This is what is written about how the ELIXIR short-term evaluation questions came to be (quote from [Gurwitz et al., 2020]):

We were interested in participant satisfaction as a reflection on training quality in order to be able to inform best practice for ELIXIR training. We acknowledge that training quality is more complex than solely participant satisfaction and that the community would benefit from future work to obtain a fuller picture on training quality.

This paragraph shows that the ELIXIR group took the liberty of adding questions beyond its two primary sources, albeit with a hint of uncertainty about this choice. This uncertainty is definitely valid, as a big meta-analysis found that 'student evaluation of teaching ratings and student learning are not related' [Uttl et al., 2017], a paper that was published three years before the ELIXIR paper.

Again from [Gurwitz et al., 2020] we read:

These metrics were developed out of those already collected by ELIXIR training providers, as well as from discussions with stakeholders, external training providers, and literature review [Brazas & Ouellette, 2016][Jordan et al., 2018]

(Note that using the term 'literature review' for two papers may be considered a misnomer: informal sources, such as my first and second Google hits for 'how many sources does a literature review need?', recommend at least 20 and 5 sources (the latter for a 'brief review') respectively.)

Nowhere is it described how the evaluation questions came to be, nor with which reasoning the best ones were selected.

Neither does the referenced literature:

  • [Brazas & Ouellette, 2016] shows the results of surveys from bioinformatics workshops. The survey questions were taken from other sources (i.e., the Society for Experimental Biology and the Global Organisation for Bioinformatics Learning, Education and Training), without any reference to the literature. It is not described how the evaluation questions came to be and with which reasoning the best were selected
  • [Jordan et al., 2018] shows the results of surveys from Data Carpentry workshops. Also here, it is not described how the evaluation questions came to be and with which reasoning the best were selected: this paper has zero references to the literature

Taking a closer look at the evaluation questions of [Jordan et al., 2018], we see that some of its questions were not used. One such omitted evaluation question asks learners to self-assess their confidence in the learning outcomes.

What does such a question look like?

Here is an example of questions that ask learners to self-assess (from [Plaza et al., 2002]):

Figure from Plaza et al., 2002

What does the analysis of such questions look like?

Here we can see the results of learners self-assessing their competences before and after the teaching session; figure from [Jordan et al., 2018]:

Figure from Jordan et al., 2018

Here we can see similar results from an earlier paper, [Raupach et al., 2011]:

Figure from Raupach et al., 2011
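
As a rough sketch of how such paired before/after self-assessments could be summarised, here is a minimal example. It is not the analysis used in any of the cited papers: the ratings are made up (a hypothetical 5-point Likert scale), and it assumes the pandas and SciPy libraries are available.

```python
# Minimal sketch: summarising paired pre/post self-assessment ratings.
# The ratings below are invented for illustration only; a real analysis
# would load the survey export instead.
import pandas as pd
from scipy.stats import wilcoxon

# Hypothetical 5-point Likert self-assessments for one learning outcome,
# one row per learner, collected before and after the teaching session.
ratings = pd.DataFrame({
    "pre":  [2, 3, 1, 2, 3, 2, 4, 1],
    "post": [4, 4, 3, 3, 5, 4, 4, 2],
})

# Average shift in self-assessed confidence.
print("mean pre: ", ratings["pre"].mean())
print("mean post:", ratings["post"].mean())

# Paired, non-parametric test of whether the shift is systematic,
# a common choice for ordinal Likert data.
statistic, p_value = wilcoxon(ratings["pre"], ratings["post"])
print(f"Wilcoxon signed-rank: statistic={statistic}, p={p_value:.3f}")
```

A paired, non-parametric test is only one possible summary; plotting the full distribution of ratings before and after, as the figures above do, is often more informative than a single statistic.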

We know that such self-assessment does not relate to actual skill [Liaw et al., 2012] (that paper cites more studies showing this). However, there is some evidence that self-assessment is informative for evaluating a course curriculum [Plaza et al., 2002] and teacher effectiveness [Raupach et al., 2011], although other studies argue that more measurements are needed to properly assess teacher effectiveness [Darling‐Hammond et al., 2010].

Why ELIXIR chose not to use this evaluation instrument is not described in the paper in which the evaluation questions were announced [Gurwitz et al., 2020].

References

  • [Ang et al., 2018] Ang, Lawrence, Yvonne Alexandra Breyer, and Joseph Pitt. "Course recommendation as a construct in student evaluations: will students recommend your course?." Studies in Higher Education 43.6 (2018): 944-959.
  • [Brazas & Ouellette, 2016] Brazas, Michelle D., and BF Francis Ouellette. "Continuing education workshops in bioinformatics positively impact research and careers." PLoS computational biology 12.6 (2016): e1004916.
  • [Darling‐Hammond et al., 2010] Darling‐Hammond, Linda, Xiaoxia Newton, and Ruth Chung Wei. "Evaluating teacher education outcomes: A study of the Stanford Teacher Education Programme." Journal of education for teaching 36.4 (2010): 369-388.
  • [Gurwitz et al., 2020] Gurwitz, Kim T., et al. "A framework to assess the quality and impact of bioinformatics training across ELIXIR." PLoS computational biology 16.7 (2020): e1007976. website
  • [Jordan et al., 2018] Jordan, Kari, François Michonneau, and Belinda Weaver. "Analysis of Software and Data Carpentry’s pre-and post-workshop surveys." Software Carpentry. Retrieved April 13 (2018): 2023. PDF
  • [Liaw et al., 2012] Liaw, Sok Ying, et al. "Assessment for simulation learning outcomes: a comparison of knowledge and self-reported confidence with observed clinical performance." Nurse education today 32.6 (2012): e35-e39.
  • [Plaza et al., 2002] Plaza, Cecilia M., et al. "Curricular evaluation using self-efficacy measurements." American Journal of Pharmaceutical Education 66.1 (2002): 51-54.
  • [Raupach et al., 2011] Raupach, Tobias, et al. "Towards outcome-based programme evaluation: using student comparative self-assessments to determine teaching effectiveness." Medical teacher 33.8 (2011): e446-e453.
  • [Roxå et al., 2021] Roxå, Torgny, et al. "Reconceptualizing student ratings of teaching to support quality discourse on student learning: a systems perspective." Higher Education (2021): 1-21.
  • [Uttl et al., 2017] Uttl, Bob, Carmela A. White, and Daniela Wong Gonzalez. "Meta-analysis of faculty's teaching effectiveness: Student evaluation of teaching ratings and student learning are not related." Studies in Educational Evaluation 54 (2017): 22-42.