Simulation basics



        Article summary

        A simulation is a way of getting data by running a virtual experiment: you take observations from a model of something real instead of from the real thing.

        For example, you could take a model of a plant species that reproduces the natural variation in height found in these plants and, at the click of a button, measure the heights of (say) 100 of them.
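
        Outside the tool, the same idea can be sketched in a few lines of plain Python. This is a minimal sketch, not how the Simulations feature itself works: it assumes heights follow a normal distribution, and the mean (50 cm), standard deviation (5 cm) and the function name `simulate_heights` are made up for illustration.

```python
import random
import statistics

# Hypothetical model parameters, chosen only for illustration.
MEAN_HEIGHT_CM = 50.0
SD_HEIGHT_CM = 5.0


def simulate_heights(n, seed=None):
    """Return n simulated plant heights drawn from a normal distribution."""
    rng = random.Random(seed)  # a seed makes the "experiment" repeatable
    return [rng.gauss(MEAN_HEIGHT_CM, SD_HEIGHT_CM) for _ in range(n)]


# "Measure" 100 plants at the click of a button.
heights = simulate_heights(100, seed=1)
print(f"number of plants: {len(heights)}")
print(f"sample mean:      {statistics.mean(heights):.1f} cm")
print(f"sample sd:        {statistics.stdev(heights):.1f} cm")
```

        Changing the seed gives a different set of 100 plants from the same model, which is a quick way to see what random variation looks like.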

        Why is this useful?

        • You can 'play' with the data, and get an intuitive understanding of what random variation looks like.
        • You can generate a dataset that resembles the data from a real experiment you want to run, and use it to plan your analysis and visualization up front.

        For example, a simulation can help you answer questions such as:

        • How many samples will I need to get a meaningful result?
        • What kind of visualizations will be relevant?
        • What might the data look like if my hypothesis is true, or not true?
        • If I ran the experiment multiple times, how might the results vary?
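
        The questions about sample size and repeated experiments can be explored with a short script. The sketch below assumes the same hypothetical plant model (normal heights, mean 50 cm, standard deviation 5 cm) and reruns the experiment many times at several sample sizes to show how much the sample mean moves from run to run. The function names are illustrative only.

```python
import random
import statistics


def sample_mean(rng, n, mean=50.0, sd=5.0):
    """Run one simulated experiment of n plants and return the mean height."""
    return statistics.mean([rng.gauss(mean, sd) for _ in range(n)])


def spread_of_means(n, repeats=1000, seed=0):
    """Repeat the experiment and return the standard deviation of the means."""
    rng = random.Random(seed)
    means = [sample_mean(rng, n) for _ in range(repeats)]
    return statistics.stdev(means)


# Larger samples give means that vary less between repeated experiments.
for n in (5, 20, 100):
    print(f"n = {n:>3}: sd of sample means = {spread_of_means(n):.2f} cm")
```

        The spread of the means shrinks roughly in proportion to one over the square root of the sample size, which helps when judging how many samples an experiment needs.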

        Multiple variable scenarios

        In the example above, height was a single numeric variable. You can also use models that have categorical variables, and models that have more than one variable. The distribution of one variable can depend on the value of another, so your simulation can model correlation or causation.

        Here is an example where the mean of a numeric variable is clearly related to the value of a categorical variable with two possible values:

        Using models like these, you can simulate more complex experiments and, again, generate datasets of possible results.
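
        As a rough illustration of a two-variable model, the sketch below makes the mean of a numeric variable (height) depend on a categorical variable with two values. The group labels "A" and "B", the 4 cm difference and the function names are assumptions made for this example, not values from the product.

```python
import random
import statistics

# Hypothetical model: plants in group "B" are on average 4 cm taller.
GROUP_MEANS_CM = {"A": 50.0, "B": 54.0}
SD_CM = 5.0


def simulate_experiment(n_per_group, seed=None):
    """Return a list of (group, height) rows, n_per_group rows per group."""
    rng = random.Random(seed)
    rows = []
    for group, mean in GROUP_MEANS_CM.items():
        for _ in range(n_per_group):
            rows.append((group, rng.gauss(mean, SD_CM)))
    return rows


def group_means(rows):
    """Return the mean height for each group in the dataset."""
    by_group = {}
    for group, height in rows:
        by_group.setdefault(group, []).append(height)
    return {group: statistics.mean(values) for group, values in by_group.items()}


rows = simulate_experiment(50, seed=2)
for group, mean in group_means(rows).items():
    print(f"group {group}: mean height = {mean:.1f} cm")
```

        Setting both group means to the same value simulates the case where the hypothesis is not true, so the two kinds of dataset can be compared side by side.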

        This also allows you to play with statistical tests on the results, and get an intuitive understanding of concepts such as the 95% confidence interval and the p-value.
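
        One way to build that intuition is to repeat a simulated experiment many times and count outcomes. The sketch below is an approximation under stated assumptions: it uses the large-sample normal (z) formulas instead of a t-test, and the model parameters and function names are hypothetical. It checks how often a 95% confidence interval contains the true mean, and how often p < 0.05 occurs when there is no real difference between two groups.

```python
import math
import random
import statistics
from statistics import NormalDist

# Hypothetical model parameters, chosen only for illustration.
TRUE_MEAN_CM = 50.0
SD_CM = 5.0


def ci95(sample):
    """Approximate 95% confidence interval for the mean (normal approximation)."""
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - 1.96 * se, mean + 1.96 * se


def ci_coverage(n=100, repeats=1000, seed=0):
    """Fraction of repeated experiments whose interval contains the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(TRUE_MEAN_CM, SD_CM) for _ in range(n)]
        low, high = ci95(sample)
        hits += low <= TRUE_MEAN_CM <= high
    return hits / repeats


def p_value(a, b):
    """Two-sided p-value for a difference in means (large-sample z approximation)."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    z = (statistics.mean(a) - statistics.mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(n=100, repeats=1000, seed=1):
    """Fraction of experiments with p < 0.05 when both groups share one model."""
    rng = random.Random(seed)
    count = 0
    for _ in range(repeats):
        a = [rng.gauss(TRUE_MEAN_CM, SD_CM) for _ in range(n)]
        b = [rng.gauss(TRUE_MEAN_CM, SD_CM) for _ in range(n)]
        count += p_value(a, b) < 0.05
    return count / repeats


print(f"CI coverage:         {ci_coverage():.3f}")
print(f"false positive rate: {false_positive_rate():.3f}")
```

        Under these assumptions the coverage should land near 0.95 and the false positive rate near 0.05, which is what the two concepts promise over many repeated experiments.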

        And there's more...

        You can also try:

        • All the other ways variables can affect each other (e.g. model data with a line of best fit for linear regression, or categorical variables affecting the probabilities of other categorical variables)
        • Group activities, where groups of students run tests on the same simulation model
        • Combining multiple result sets into new datasets, to explore mathematical properties such as the mean of means and to learn about other statistical concepts
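
        The line-of-best-fit idea from the list above can be sketched as follows. This is a minimal example under assumed values: a numeric variable y depends linearly on x with normally distributed noise, and the intercept, slope, noise level and function names are all hypothetical. The fit uses the ordinary least-squares formulas written out by hand.

```python
import random
import statistics

# Hypothetical linear model: y = intercept + slope * x + noise.
TRUE_INTERCEPT = 20.0
TRUE_SLOPE = 3.0
NOISE_SD = 2.0


def simulate_xy(n, seed=None):
    """Return n simulated (x, y) observations as two lists."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [TRUE_INTERCEPT + TRUE_SLOPE * x + rng.gauss(0.0, NOISE_SD) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


xs, ys = simulate_xy(200, seed=4)
slope, intercept = fit_line(xs, ys)
print(f"fitted slope:     {slope:.2f} (model value {TRUE_SLOPE})")
print(f"fitted intercept: {intercept:.2f} (model value {TRUE_INTERCEPT})")
```

        Because the model values are known, the fitted line can be compared directly with the line that generated the data, something a real experiment never allows.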


        Learn more
        See the other articles in this Simulations section of the User Guide to learn what else can be done with Simulations.



