Files
This page includes the data files for the Introduction to R workshop. Most are in csv format and can be imported directly from R using the following URL syntax:
read.csv("http://mc-oman.on-rev.com/R/resources/Datafiles/name_of_file.csv"), or enter http://mc-oman.on-rev.com/R/resources/Datafiles/name_of_file.csv into the import file dialog box in RStudio.
This, however, does not work (yet) behind the SQU firewall, and perhaps not behind other firewalls. Some computers will open the content of the file in a new browser page. To avoid this behaviour, please “control”- or “alt”-click on the link to download the file.
You can also download the whole series of files as a zip archive, which you can then unpack on your computer and use as regular files.
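The two ways of getting at the files described above (reading from the URL, or using a local copy from the zip archive) can be combined in a short helper. This is only a sketch: the function names `datafile_url` and `read_workshop_csv` and the folder name `Datafiles` are made up for illustration, and the base URL is the one given on this page.

```r
# Base URL as given on this page
base_url <- "http://mc-oman.on-rev.com/R/resources/Datafiles/"

# Hypothetical helper: build the full URL for one of the files listed below
datafile_url <- function(name) {
  paste0(base_url, name)
}

# Hypothetical helper: read a local copy if one exists (for example after
# unpacking the zip archive into local_dir), otherwise try the URL.
# Returns NULL with a message instead of stopping if the download fails,
# which may happen behind a firewall.
read_workshop_csv <- function(name, local_dir = "Datafiles") {
  local_path <- file.path(local_dir, name)
  if (file.exists(local_path)) {
    return(read.csv(local_path))
  }
  tryCatch(
    read.csv(datafile_url(name)),
    error = function(e) {
      message("Could not download ", name, ": ", conditionMessage(e))
      NULL
    }
  )
}
```

For instance, `moths <- read_workshop_csv("SpruceMoths.csv")` followed by `str(moths)` should show the structure of the spruce moth data, provided the file can be reached.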
Unstacked.csv
A fictitious data set in which the data have been unstacked.
MorayEelObservations.csv
Documents the abundances of Moray eels observed by divers at three locations both in the morning and in the afternoon of the same days.
AcidAlfalfa.csv
Reports the growth of alfalfa plants at 3 pH conditions in a blocked experiment.
SpruceMoths.csv
Documents the catches of spruce moth collected by 3 different types of traps at 4 different heights in trees.
SmokingStudents.csv
A study on the habits of students in terms of smoking and exercise according to gender.
FishGender.csv
A study on the habits of students in terms of smoking and exercise according to gender.
TemperatureGender.csv
A data set comparing the cost of the same grocery items in different stores.
PlantControl.csv
Effect of responsibilization on patient well-being. All patients received a plant. Some were in charge of its maintenance; for others, a “gardener” maintained the plant.
TreeDiameter.csv
A series of measurements of tree diameter at breast height (DBH) and tree height.
FishGrowth.csv
A series of length/age values for a fish (non-linear regression).
Menarche.csv
A file describing the proportion of girls of different ages who are reproductively mature (logistic regression).
Ultrasound.csv
These data are the result of a study measuring ultrasonic response (y: whatever that is) as a function of distance in a metal (x).
Breastfeeding.csv
Measuring how much a baby drinks during breastfeeding is not straightforward. In this study two methods were used (deuterium dilution and infant weight) on 14 infants.
Diamonds10000.csv
This dataset includes a subset of the diamonds dataset from ggplot2. It describes the quality of 10000 diamonds randomly selected from this large dataset.
SourCream.csv
This dataset corresponds to measurements of the acidity of sour cream made with varying bacterial strains (called here starter, for starter culture). Two additional variables were introduced in the experiment: batch (the day of the week the experiment took place) and position (where in the “incubator” the jars were positioned).
ChildrenHeight.csv
This simple dataset describes randomly selected 12-year-old pupils in a school. In addition to gender (male vs female), height and weight were measured.
PhosphorusLeaves.csv
The data correspond to the phosphorus content measured in the composted leaves of different varieties of apple trees in the same orchard.
Lead.csv
Data set of lead contamination in children.
GDP.csv
This is a list of per capita GDP for many countries.
DeathRate.csv
This dataset contains the mortality rate for many countries.
NormalTemp.csv
Documents the “normal” temperature of male and female volunteers.
Concentrations.csv
This dataset contains information about the yield of a crop in relation to various concentrations of nitrogen in the soil. It is in a “wide” format.
HorseshoeCrabs.csv
Documents the abundances of spawning horseshoe crabs on 24 beaches in the US during two consecutive years.
MoodProzac.csv
Reports on the mood of 9 patients before and after taking a small daily dose of Prozac.
LatinSQ.csv
Growth of a crop in a true Latin square experimental design.
AppleRootStock.csv
Documents the growth of apple trees (extension in cm) grafted on 5 different root stocks.
Abalone.csv
A series of morphometric measurements made on more than 4000 abalone.
TemperatureGender.csv
A series of temperatures measured on both male and female patients at different times.
Cats.csv
Weight of the body and heart of a series of male and female cats.
DiversPerformances.csv
Divers lose their concentration at depth as nitrogen gains toxicity. This set measures the ability of divers to perform arithmetic operations at the surface and at depth.
HyperVentilation.csv
A small data frame in which the same people were asked to hold their breath as long as possible with and without prior hyperventilation.
ChirpFreq.csv
Two variables: the number of chirps per second for a ground cricket and the air temperature.
CystolicAgeWeight.csv
The data document the systolic pressure (blood pressure) in relation to the age and weight of a series of patients in the hospital.
MortalityRate.csv
This dataset shows the mortality rate (per 1000 inhabitants) in relation to the number of doctors (per 1000 inhabitants), the number of hospitals (per 100000 inhabitants) and population density (per km²).
Pasture.csv
The dataset links time (in days) with the yield of a pasture (kg).
Statex77.csv
The dataset presents a whole series of variables associated with the different states of the US in 1977.
Packages.csv
The dataset gives the sales of 4 different cereal packages (same cereal) in similar supermarkets (20 supermarkets). In one, a fire did not allow the experiment to continue, and no data were collected.
Dioxins.csv
The data show the changes in dioxin concentration in the digestive glands of crabs from two locations over several years. Each data point corresponds to the average of several measurements.
Pcb.csv
PCB is the amount of polychlorinated biphenyl observed in male and female fish of different species. Each measurement corresponds to a single fish.
AdNewspaper.xls
This dataset represents the number of enquiries following advertisements in different sections of a newspaper on the different days of the week. Attention, the file probably needs some work.
Countries.csv
A database with a lot of information about different countries in the world.