R - removing rows and replacing values using conditions from multiple columns

Posted by lecodesportif on Stack Overflow
Published on 2011-01-15T20:45:05Z Indexed on 2011/01/15 20:53 UTC


I want to filter out all rows where var3 < 5 while keeping at least one row for each value of var1.

> foo <- data.frame(var1= c(1, 1, 2, 3, 3, 4, 4, 5), var2=c(9, 5, 13, 9, 12, 11, 13, 9), var3=c(6, 8, 3, 6, 4, 7, 2, 9))
> foo
  var1 var2 var3
1    1    9    6
2    1    5    8
3    2   13    3
4    3    9    6
5    3   12    4
6    4   11    7
7    4   13    2
8    5    9    9

subset(foo, (foo$var3>=5)) would remove rows 3, 5 and 7, and I would lose var1==2.
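
A quick way to confirm which values of var1 disappear is sketched below in base R. The names kept and lost are only illustrative and not part of the original data.

```r
# Rebuild the example data so the snippet is self-contained.
foo <- data.frame(var1 = c(1, 1, 2, 3, 3, 4, 4, 5),
                  var2 = c(9, 5, 13, 9, 12, 11, 13, 9),
                  var3 = c(6, 8, 3, 6, 4, 7, 2, 9))

# Plain filtering on the condition.
kept <- subset(foo, var3 >= 5)

# Values of var1 that no longer appear after filtering.
lost <- setdiff(unique(foo$var1), unique(kept$var1))
lost
```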

  • I want to remove the row if another row with the same value of var1 fulfills the condition foo$var3 >= 5. See row 5.
  • I want to keep the row, assigning NA to var2 and var3, if no occurrence of that value of var1 fulfills the condition foo$var3 >= 5. See row 3.

This is the result I expect:

  var1 var2 var3
1    1    9    6
2    1    5    8
3    2   NA   NA
4    3    9    6
6    4   11    7
8    5    9    9

This is the closest I got:

> foo$var3[ foo$var3 < 5 ] = NA
> foo$var2[ is.na(foo$var3) ] = NA
> foo
  var1 var2 var3
1    1    9    6
2    1    5    8
3    2   NA   NA
4    3    9    6
5    3   NA   NA
6    4   11    7
7    4   NA   NA
8    5    9    9

So I guess I just need to know how to conditionally remove the row.
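
One possible way to do the conditional removal is sketched below in base R, starting again from the original foo. It assumes that when no row of a var1 group fulfills the condition, only the first row of that group should be kept; the question does not cover that case explicitly. The helper names ok, group_ok, keep_as_na and result are illustrative.

```r
# Rebuild the example data so the sketch is self-contained.
foo <- data.frame(var1 = c(1, 1, 2, 3, 3, 4, 4, 5),
                  var2 = c(9, 5, 13, 9, 12, 11, 13, 9),
                  var3 = c(6, 8, 3, 6, 4, 7, 2, 9))

# TRUE for rows that fulfill the condition.
ok <- foo$var3 >= 5

# For each row, TRUE if any row with the same var1 fulfills the condition.
group_ok <- as.logical(ave(ok, foo$var1, FUN = any))

# Assumption: for a group with no qualifying row, keep only its first row.
keep_as_na <- !group_ok & !duplicated(foo$var1)

# Blank out var2 and var3 on those placeholder rows.
foo[keep_as_na, c("var2", "var3")] <- NA

# Keep qualifying rows plus one placeholder row per otherwise lost group.
result <- foo[ok | keep_as_na, ]
result
```

For the example data this should leave rows 1, 2, 3, 4, 6 and 8, with NA in var2 and var3 for var1 == 2, which matches the expected result above. Using ave() keeps the group-level test aligned with the rows of foo, so no merge or reordering is needed.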

© Stack Overflow or respective owner
