## Workshops

All pre-conference workshops will be held on 3 September 2024.

# WORKSHOP 1: Bayesian estimation of microbiological dose-response functions

### Organiser and Tutor:

Dr. Jukka Ranta, Finnish Food Authority

### Objective:

Dose-response models are the last step in farm-to-fork food chain models, predicting the number of cases of foodborne zoonotic diseases. The overall objective of the workshop is to give a theoretical introduction to Bayesian estimation of classical single-hit dose-response models and their practical implementation in the BUGS language. As a prerequisite, participants should have basic knowledge of probability distributions, parameter estimation and Monte Carlo simulation, and of the elements of Bayesian statistics: the prior distribution and likelihood functions (conditional distributions of observable data variables).

Different types of data are discussed: experimental and outbreak data, individual and grouped data, e.g. for Salmonella, Campylobacter and Listeria. Computational approaches for evaluating the beta-Poisson model, which has no analytical solution, include Monte Carlo integration within MCMC (as ‘2D Monte Carlo’) and numerical integration within MCMC. Single dose-response functions are then extended to account for variation between bacterial strain types and/or between studies. Finally, a combination of a dose-response model and an exposure model is discussed for obtaining population risk estimates with uncertainty.
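As an illustration of the Monte Carlo integration idea mentioned above, here is a minimal base R sketch (not workshop material). It assumes the usual single-hit formulation in which the ingested dose is Poisson distributed and each organism infects independently with probability r, with r following a Beta(alpha, beta) distribution. The function names and the parameter values (alpha = 0.2, beta = 50) are hypothetical, chosen only for illustration.

```r
# Exponential single-hit model: constant per-organism infection probability r.
p_exponential <- function(dose, r) 1 - exp(-r * dose)

# Beta-Poisson model evaluated by Monte Carlo integration:
# P(infection | dose) = 1 - E[exp(-r * dose)], with r ~ Beta(alpha, beta).
p_beta_poisson_mc <- function(dose, alpha, beta, n_draws = 1e5) {
  r <- rbeta(n_draws, alpha, beta)  # one set of draws reused for every dose
  vapply(dose, function(d) 1 - mean(exp(-r * d)), numeric(1))
}

# Widely used approximation, reasonable when beta is much larger than alpha and 1.
p_beta_poisson_approx <- function(dose, alpha, beta) {
  1 - (1 + dose / beta)^(-alpha)
}

set.seed(42)
doses <- 10^(0:6)   # mean doses (organisms); hypothetical values
alpha <- 0.2        # hypothetical parameter value
beta  <- 50         # hypothetical parameter value

p_mc     <- p_beta_poisson_mc(doses, alpha, beta)
p_approx <- p_beta_poisson_approx(doses, alpha, beta)

print(data.frame(dose = doses, monte_carlo = p_mc, approximation = p_approx))
```

Within MCMC, the same averaging would be repeated for each sampled pair (alpha, beta), which is what gives the approach its two-dimensional character.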

Part I: Theory – introduction to single-hit dose-response models. Binomial, exponential, beta-binomial and beta-Poisson models.

Part II: Practicals – BUGS-model implementations of dose-response models and dose-response data sets: experimental data and outbreak data, adding expert opinions. Hierarchical models for variation between bacterial strains and/or between studies. Population risk estimation with a combined exposure model and dose-response model, with remarks on limitations. Examples are run in R with OpenBUGS using Google Colab.

#### Pre-requisites:

The models are implemented in Google Colab, so participants need a Google account and their own computer to open the shared file, which is accessed through a web link sent by email. Installing R and OpenBUGS locally is not necessary, although participants may do so if they wish.

# WORKSHOP 2: Creation of an end-to-end machine learning pipeline with {tidymodels}

### Organiser and Tutor:

Antoine Bichat, Les Laboratoires Servier

### Objective:

{tidymodels} (Kuhn and Wickham 2020) brings together a collection of packages that facilitate the use of statistical learning methods (such as random forests, penalized linear models, etc.) within a unified and tidy framework (Wickham 2014). This tutorial will show you how to use these packages to preprocess data; build, train, and evaluate a model; and optimize hyperparameters. In short, it covers everything you need to know to carry out a supervised statistical learning project from start to finish.
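To give a flavour of the workflow described above, here is a minimal sketch (not the workshop material itself). It assumes the {tidymodels} and {glmnet} packages are installed, uses the built-in `mtcars` data purely as a stand-in, and fixes the penalty at an arbitrary value instead of tuning it.

```r
library(tidymodels)

set.seed(123)

# Split the data into training and test sets.
data_split <- initial_split(mtcars, prop = 0.75)
train_data <- training(data_split)
test_data  <- testing(data_split)

# Preprocessing: centre and scale all numeric predictors.
rec <- recipe(mpg ~ ., data = train_data) |>
  step_normalize(all_numeric_predictors())

# Model: lasso-penalized linear regression (penalty value is arbitrary here).
mod <- linear_reg(penalty = 0.1, mixture = 1) |>
  set_engine("glmnet")

# Bundle preprocessing and model into a single workflow, then fit it.
wf <- workflow() |>
  add_recipe(rec) |>
  add_model(mod)

wf_fit <- fit(wf, data = train_data)

# Evaluate on the held-out test set.
preds <- predict(wf_fit, new_data = test_data) |>
  bind_cols(test_data |> select(mpg))

metrics <- rmse(preds, truth = mpg, estimate = .pred)
print(metrics)
```

In a full project, the fixed penalty would typically be replaced by `tune()` and selected by resampling.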

#### References

Kuhn, Max, and Hadley Wickham. 2020. Tidymodels: A Collection of Packages for Modeling and Machine Learning Using Tidyverse Principles.

Wickham, Hadley. 2014. “Tidy Data.” Journal of Statistical Software 59: 1–23.

#### Pre-requisites

This workshop will be conducted in R. Familiarity with the {tidyverse}, especially {dplyr}, is recommended. To get the most out of this tutorial, please ensure that you have R version 4.1 or higher (https://cran.r-project.org) and a recent version of RStudio (https://www.rstudio.com/download), and install the R packages we will use beforehand with the following command:

install.packages(c("tidyverse", "tidymodels", "glmnet", "ranger", "xgboost", "finetune", "workflowsets", "corrr", "vip", "ggforce", "ggrain"))

# WORKSHOP 3: Structural equation modelling from covariance analysis to PLS-SEM

### Organiser and Tutor:

Jean-Michel Galharret, ONIRIS Nantes

### Objective:

Structural equation modelling can be viewed as a generalization of linear regression models to complex systems. It involves studying the relationships between several blocks of matched data on the same individuals, each block being described by its own set of observed variables. In these models, an unobserved (latent) variable is associated with each block, and we are interested in the set of regression equations linking these latent variables together. The coefficients of these models can be estimated using covariance analysis (Jöreskog 1970, LISREL) or the PLS approach (Wold 1982). Covariance analysis is still the most widely used approach in the human and social sciences. It is also the one with the most advanced foundations in terms of statistical validation. The PLS approach, also known as PLS-PM or more recently PLS-SEM, is often favored because of its ability to associate a component with each block (as in PCA), thus representing the unobserved variables. After a brief mathematical introduction to this type of modelling in the context of latent variable models, we will illustrate the covariance analysis approach in psychology. Finally, the PLS-SEM model will be briefly presented. The programme is as follows:

- From regression to path models: example of the mediation model.
- Including latent variables in the modeling.
- Graphical representation of a SEM.
- Mathematical background for the covariance analysis.
- Complete example from an educational study.
- Introduction to the R package lavaan. In the following examples, the code for implementing each model in the software and an interpretation of the software output will be provided.
- Fitting a path model.
- Fitting a SEM.
- Testing invariance on a measurement model.
- Fitting a cross-lagged model.
- A brief introduction to PLS-SEM.
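As a taste of the first programme item, the mediation model can be sketched in lavaan as follows. This is a minimal illustration on simulated data, not the workshop's own example: the variable names `X`, `M`, `Y`, the sample size and the path coefficients used in the simulation are all hypothetical.

```r
library(lavaan)

# Simulate data in which X affects Y only through the mediator M.
set.seed(1)
n <- 200
X <- rnorm(n)
M <- 0.5 * X + rnorm(n)
Y <- 0.7 * M + rnorm(n)
dat <- data.frame(X = X, M = M, Y = Y)

# Path model in lavaan syntax: labelled paths a, b, c and derived effects.
model <- '
  M ~ a * X            # effect of X on the mediator
  Y ~ b * M + c * X    # effect of the mediator and direct effect of X
  ab    := a * b       # indirect (mediated) effect
  total := c + a * b   # total effect
'

fit <- sem(model, data = dat)
pe  <- parameterEstimates(fit)
print(pe[pe$label != "", c("label", "est", "se", "pvalue")])
```

The `:=` operator defines new parameters as functions of labelled ones, so the indirect effect is reported with its own standard error.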

#### Pre-requisites

Install R (https://cran.r-project.org/) and the package lavaan (https://cran.r-project.org/web/packages/lavaan/index.html).