After importing data into Stata, we should set the variable
names and labels so that anyone can understand the
information available in the database.
The name holds an abbreviation of the variable, for instance gdp, and
the label holds its description, “Gross domestic product”. I recommend avoiding
capital letters in variable names: Stata is case sensitive, so lowercase names will make your life easier.
We are going to use a subset of the data from the Penn World
Table. The complete database can be downloaded from here, and the Stata database we
are using from here. The original
description of the database is the following:
Maybe you can guess what each variable is, but it is
advisable to change the variable names and labels so that we know precisely
what information we have. We can tackle
this task in two different ways (as with almost everything we want to do in Stata):
one is by using commands and the other is by using the user-friendly windows.
Commands
To change the name of a variable we use “rename”, and to
change its label we use “label variable”:
rename old_name new_name
label variable variable_name "new label"
For example:
rename countrycode cntrycode
label variable cntrycode "country code"
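To try the rename and label variable commands above without downloading anything, here is a minimal sketch on a tiny made-up dataset. The variable gdpraw and the figures typed in are hypothetical and only serve as an illustration; they are not taken from the Penn World Table.

```stata
* Toy dataset typed by hand (hypothetical names and values)
clear
input str3 countrycode int year double gdpraw
"ESP" 2010 100
"FRA" 2010 200
end

* Short lowercase names, with the full description in the label
rename countrycode cntrycode
rename gdpraw gdp
label variable cntrycode "country code"
label variable gdp "Gross domestic product"

* Check the result: names and labels appear side by side
describe
```

Running describe before and after the changes is an easy way to confirm that each name and label ended up where you wanted it.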
Then, if we want to add an ending (a suffix) to all the variables, we can use the following:
rename * *_home
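The same wildcard syntax can also take the ending off again or add a prefix instead. The sketch below assumes a recent version of Stata, where rename accepts wildcards, and uses two made-up variables (gdp and pop) created only for the illustration.

```stata
* Two hypothetical variables, one observation each
clear
set obs 1
generate gdp = 1
generate pop = 2

* Add the ending to every variable: gdp_home, pop_home
rename * *_home

* Take the ending off again: gdp, pop
rename *_home *

* Add a prefix instead: home_gdp, home_pop
rename * home_*
describe
```

Because the * pattern matches every variable in memory, it is worth running describe afterwards to make sure nothing was renamed by accident.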
It is strongly recommended to write the commands down in
the do file. That way we can always run them again without needing
to retype them all. You can download the do file here.
Windows
Changing variable names and labels can also be done without
commands. I personally prefer this way of managing them. At the top of
the Stata window we will find the following icon:
Then, it is quite easy:
Change and apply.
While you change the names and labels, you will notice that
the corresponding commands appear in the bar where commands are recorded. You may copy them
to the do file.
After the changes our database will be the following:
Value label
Sometimes we need to give labels to the values of a variable, especially when
we are dealing with qualitative data. Suppose we create a variable named Music
Quality (mscquality) that classifies periods of time according to the
quality of their music. Our variable will have three different values: good (1),
regular (2) and bad (3), but the information originally contained in mscquality
is just 1, 2 and 3. If we want each number to have a label (useful
when we make tables, for example) we have to use the following commands:
label define label_name number "label we give to the number"
label values variable_name label_name
In our example this would be:
label define music 1 "good" 2 "regular" 3 "bad"
label values mscquality music
label variable mscquality "music quality"
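To see what the value labels change in practice, here is a minimal sketch that builds mscquality by hand and tabulates it with and without the labels. The period variable and the values assigned to each period are invented purely for the illustration.

```stata
* Made-up data: one row per period, quality coded 1, 2 or 3
clear
input int period byte mscquality
1960 1
1970 1
1980 2
1990 3
end

* Define the set of labels and attach it to the variable
label define music 1 "good" 2 "regular" 3 "bad"
label values mscquality music
label variable mscquality "music quality"

* Show the label definition, then tabulate with and without labels
label list music
tabulate mscquality
tabulate mscquality, nolabel
```

The stored values remain 1, 2 and 3; the labels only change how they are displayed, which is why the nolabel option still shows the numbers.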
This variable is completely made up. However, it would be
interesting to explore how quality of music affects economic growth.
By clicking here you can download a do file with the value labels
for the NACE Rev. 2 firm-level classification, at both the 4-digit and 1-digit levels. I
made it myself. I hope it saves you a few
hours.
In the next section we look into the description of the
database.



