using subset but old variables still left

Posted by user2520852 on Stack Overflow
Published on 2013-06-25T16:18:04Z

I am working with a data set of daily usage data (let's just say variables X and Y) for different cities (about 150 cities). I have created a subset of the data for only specific cities, choosing just 3 of the 150. When I then run tapply by city, I get means for the 3 cities but also get NA for the other 147 cities that were in the original data set. I am using the code below:

df <- read.csv(...)

# keep only the three cities of interest
df_sub <- subset(df, City %in% c(1, 3, 19))

# mean of X per city within the subset
X_Breakdown <- tapply(df_sub$X, df_sub$City, mean, na.rm = TRUE)

print(X_Breakdown)

                    City 1                         City 2 
                        15                             NA 
                    City 3                         City 4 
                        12                             NA 
                    City 5                         City 6 
                        NA                             NA 

Hope you get the idea. I would like to get a dataset that only contains the 3 cities that I'm interested in.

It seems that the full set of cities is still stored with the City variable in R even after subsetting. Is there a way to fix this? Kindly advise. Thanks.
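A minimal sketch of what may be going on, assuming City is stored as a factor (the output labels suggest it is). subset() keeps every factor level even when no rows use it, so tapply() reports NA for the empty ones; droplevels() from base R discards the unused levels. The toy data frame below is hypothetical and only stands in for the real file.

```r
# Hypothetical toy data; column names follow the question, values are made up.
df <- data.frame(
  City = factor(rep(paste("City", 1:6), each = 2)),
  X    = c(14, 16, 5, 7, 11, 13, 8, 9, 2, 4, 6, 10)
)

# Keep two cities; the factor still remembers all six levels.
df_sub <- subset(df, City %in% c("City 1", "City 3"))

# Empty levels show up as NA here, as in the question.
with_na <- tapply(df_sub$X, df_sub$City, mean, na.rm = TRUE)

# Drop the levels that no longer occur, then summarise again.
df_sub <- droplevels(df_sub)
X_Breakdown <- tapply(df_sub$X, df_sub$City, mean, na.rm = TRUE)
print(X_Breakdown)
```

With the unused levels dropped, the result contains only the cities that are present in the subset. If City is not a factor in the real data, this explanation would not apply.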

© Stack Overflow or respective owner