Wednesday, 19 August 2015

First steps in Stata 2: Set names and labels

After importing data into Stata, we should set the variable names and labels so that anyone can understand the information available in the dataset. The name is a short abbreviation of the variable, for instance gdp, and the label holds the description, "Gross domestic product". I recommend avoiding capital letters in variable names: Stata is case-sensitive, so sticking to lowercase will make your life easier.


We are going to use a subset of the data from the Penn World Table. The complete database can be downloaded from here, and the Stata dataset we are using from here. The original description of the dataset is the following:


Maybe you can guess what each variable is, but it is advisable to change the variable names and labels so that we know precisely what information we have. We can tackle this task in two different ways (as with almost everything in Stata): one is by using commands and the other is by using the user-friendly windows.

Commands

To change the name of a variable we use “rename”, and to change its label we use “label variable”:

rename old_name new_name
label variable variable_name "new label"

rename countrycode cntrycode
label variable cntrycode "country code"

Then, if we want to add a suffix to all the variables, we can use the following:
rename * *_home
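
As a minimal sketch of the commands above, the following builds a tiny made-up dataset and applies them in order. The variable gdp_raw and the values are hypothetical and only serve as an illustration, and the example assumes a Stata version that supports the wildcard form of rename.

```stata
* Made-up data: three observations, two variables (illustrative only)
clear
set obs 3
generate countrycode = "ESP"
generate gdp_raw = 100 * _n      // hypothetical variable

* Rename each variable and attach a descriptive label
rename countrycode cntrycode
label variable cntrycode "country code"
rename gdp_raw gdp
label variable gdp "Gross domestic product"

* Add the suffix _home to every variable in the dataset
rename * *_home

* Check the resulting names and labels
describe
```

Note that renaming a variable keeps its label, so the labels set before the last rename are still attached to cntrycode_home and gdp_home.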

It is strongly recommended to write the commands down in a do-file. That way we can always run them again without retyping them. You can download the do-file here.
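
For readers who have not written a do-file before, here is a rough sketch of what one might contain. It is not the do-file linked above: the file name is hypothetical, and it uses the auto example dataset that ships with Stata so that it runs on its own.

```stata
* rename_labels.do  (hypothetical file name)
* Load an example dataset shipped with Stata
sysuse auto, clear

* Rename a variable and give it a descriptive label
rename mpg mileage
label variable mileage "mileage (miles per gallon)"

* Check the result
describe mileage
```

Saved under that name in the working directory, the file could then be run from the command line with: do rename_labels.do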

Windows

Variable names and labels can also be changed without commands. I personally prefer this way of managing them. At the top of the Stata window we will find the following icon:


Then, it is quite easy:


Change and apply.

As you change the names and labels, you will notice that the corresponding commands appear in the panel where commands are recorded. You may copy them into the do-file.

After the changes, our dataset will look like this:



Value label

Sometimes we need to label the values of a variable, especially when we are dealing with qualitative data. Suppose we create a variable named music quality (mscquality) and classify periods of time according to the quality of their music. The variable takes three values: good (1), regular (2) and bad (3), but the information stored in mscquality is just 1, 2 and 3. If we want each number to have a label (useful when we make tables, for example), we use the following commands:

label define label_name number "label we give to the number"
label values variable_name label_name
In our example this would be:
label define music 1 "good" 2 "regular" 3 "bad"
label values mscquality music
label variable mscquality "music quality"
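
To see the effect, the commands above can be tried on made-up data. The sketch below assumes six invented observations; only mscquality and the label name music come from the example.

```stata
* Made-up data: mscquality takes the values 1, 2, 3, 1, 2, 3
clear
set obs 6
generate mscquality = mod(_n - 1, 3) + 1

* Define the value label and attach it to the variable
label define music 1 "good" 2 "regular" 3 "bad"
label values mscquality music
label variable mscquality "music quality"

* Tables now show the text labels instead of the numbers
tabulate mscquality

* The stored values are still 1, 2 and 3
tabulate mscquality, nolabel

* List the number-to-text mapping held in the label
label list music
```

The data themselves are not changed: the value label only affects how the numbers are displayed.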

This variable is completely made up. However, it would be interesting to explore how quality of music affects economic growth.

By clicking here you can download a do-file with the labels for the NACE Rev. 2 firm-level classification, at the 4-digit and 1-digit levels. I made it myself, and I hope it saves you a few hours.

In the next section we look into the description of the database. 
