
The steps for a reproducible randomization in Stata are:
- Set version: see ieboilstart for boilerplate code that standardizes the Stata version within do-files.
- Set seed: this ensures that the same random number is generated for the first observation, for the second observation, and so on, every time the code is run.
- Sort the data by the unique ID: the data should be sorted so that observations are in the same order every time the code is run.
- Convert the random numbers into categorical variables for treatment or control status.
- After randomizing, output a CSV file that contains the ID variable used for randomization and the categorical variables created from the random numbers.
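The effect of setting the seed can be checked directly. The following is a minimal sketch, not part of the original page: the dataset size, the variable names (`unique_id`, `draw_first`, `draw_second`), the seed value, and the version number are all assumptions chosen for illustration. It draws random numbers twice from the same seed on identically sorted data and confirms that the two draws match.

```stata
* Hypothetical illustration: the same seed and the same sort order
* reproduce the same random numbers.
clear
version 13                     // assumed version; use the one your project standardizes on
set obs 10                     // hypothetical dataset of 10 observations
gen unique_id = _n             // hypothetical unique ID variable
isid unique_id, sort           // confirm the ID is unique and sort by it

set seed 12345                 // arbitrary example seed
gen draw_first = runiform()

set seed 12345                 // reset to the same seed
gen draw_second = runiform()

* The two draws are identical, observation by observation
assert draw_first == draw_second
```

If either the seed or the sort order differs between runs, the final `assert` would generally fail, which is why both steps appear in the list above.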


    * Set the environment to make randomization replicable
    * Assign random numbers to the observations and rank them from the smallest to the largest
    gen random_number = uniform()
    egen ordering = rank(random_number)
    * Assign observations to control & treatment group based on their ranks
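The do-file fragment above shows only the random-number and ranking commands. The following is one possible way to fill in the remaining steps, offered as a sketch under stated assumptions rather than as the original authors' code. The simulated dataset, the variable names `unique_id` and `treatment`, the seed value, the version number, the even split between groups, and the output file name are all hypothetical. It also uses `runiform()`, the current name for Stata's uniform random-number function, in place of `uniform()`.

```stata
* Hypothetical setup: simulate 100 observations with a made-up unique ID
clear
version 13                     // assumed version; use the one your project standardizes on
set obs 100
gen unique_id = _n             // hypothetical unique ID variable

* Set the environment to make randomization replicable
isid unique_id, sort           // confirm the ID is unique and sort by it
set seed 650145                // arbitrary example seed

* Assign random numbers to the observations and rank them
* from the smallest to the largest
gen random_number = runiform()
egen ordering = rank(random_number)

* Assign observations to control & treatment group based on their ranks
* (assumption: an even split, lower half of ranks treated)
gen byte treatment = (ordering <= _N/2)
label define treat_lbl 0 "Control" 1 "Treatment"
label values treatment treat_lbl

* Output the ID and the assignment as a CSV file (hypothetical file name)
export delimited unique_id treatment using "randomization_assignments.csv", replace
```

With real data, the simulated setup block would be replaced by loading the sample list, and the split rule would follow the study design.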
Randomizing in Stata and subsequently preloading the generated data file into the survey software is preferred to randomizing in Excel or randomizing in survey software. The main advantages of randomizing in Stata are:
- The process is transparent and reproducible.
- The researcher has more control of the process and can check randomization balance and add stratification variables if needed.
- As opposed to randomizing in the survey software, randomizing in Stata allows for time between randomization, implementation, and data collection, giving the research team the opportunity to double-check assignments and fix bugs before using the software in the field.
An example of a randomization do-file appears above.
Common alternatives to using Stata for randomization include: (i) using Excel's RAND function, (ii) randomizing directly within a chosen electronic survey platform such as SurveyCTO, or (iii) randomizing through a public lottery.
- Randomizing in Stata is preferred to randomizing in Excel or randomizing in survey software because it is transparent, reproducible, and gives the research team more time to run balance tests and double-check assignments.
- Make sure to set the version, set the seed, sort the data, and use unique IDs when randomizing in Stata.
- For information on how to draw a stratified random sample, see Stratified Random Sample.

