R: Plotting a graph with different colors of points based on advanced criteria

Posted by balconydoor on Stack Overflow
Published on 2014-08-25T10:12:20Z Indexed on 2014/08/25 16:20 UTC


I would like to make a plot (using ggplot) where the x axis represents years and the last three years have a different colour from the rest. The colour of the last three years depends on a criterion: they should be green if their mean is less than the 66th percentile of the remaining years, and red if it is more. So far I have written two functions. The first calculates the mean of the last three years:

LYM3 <- function (x) {
  LYM3 <- tail(x,3)
  mean(LYM3$Data,na.rm=T)
}

And the second calculates the 66th percentile of the remaining years:

perc66 <- function(x) {
  percentile <- head(x,-3)
  quantile(percentile$Data, .66, names=F,na.rm=T) 
}

Here are two data sets that can be used in the calculations (and plots). The first is an example from my real data, where LYM3(df1) < perc66(df1); the second is made-up data where LYM3(df2) > perc66(df2).

df1<- data.frame(Year=c(1979:2010),
                Data=c(347261.87,  145071.29,   110181.93,  183016.71,  210995.67,  205207.33,  103291.78,  247182.10,  152894.45,  170771.50,  206534.55,  287770.86,  223832.43,  297542.86,  267343.54,  475485.47,  224575.08,  147607.81,  171732.38,  126818.10,  165801.08,  136921.58,  136947.63,  83428.05,   144295.87,  68566.23,   59943.05,   49909.08,   52149.11,   117627.75,  132127.79,  130463.80))
df2 <- data.frame(Year=c(1979:2010),
                  Data=c(sample(50,29,replace=T),75,75,75))

Here’s my code for my plot so far:

plot <- ggplot(df1, aes(x=Year, y=Data)) +
  theme_bw() +
  geom_point(size=3, aes(colour=ifelse(df1$Year<2008, "black",ifelse(LYM3(df1) < perc66(df1),"green","red")))) +
  geom_line() +
  scale_x_continuous(breaks=c(1980,1985,1990,1995,2000,2005,2010), limits=c(1978,2011))
plot

As you can see, it doesn’t really do what I want. All it seems to do is turn the years before 2008 into one level and the years from 2008 onwards into another, and base the point colour on these two levels.

Since I don’t want the cutoff year to be hard-coded either, I made another small function:

fun3 <- function(x) {
df <- subset(x, Year==(max(Year)-2))
df$Year
}

So the previous code would have the same effect as:

geom_point(size=3, aes(colour=ifelse(df1$Year<fun3(df1), "black","red"))) 

But it still ignores my colours. Why does it turn the years into levels? And why doesn’t an ifelse nested inside another ifelse work in this case? How can I get the arguments to do what I want? I realise this is a bit messy and that I am asking a lot at once, but I hope my description is clear enough. It would be helpful if someone could at least point me in the right direction.
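
A possible direction, offered as a sketch rather than a definitive answer: as far as ggplot2's documented behaviour goes, anything placed inside aes() is treated as data and passed through a scale, so the strings "black", "green" and "red" become category labels that receive default palette colours. The nested ifelse itself returns the expected strings. One way around this is to store the colour names in a column and add scale_colour_identity(), which uses the values literally. The data below is made up and shortened, and the names last_mean, rest_q66, recent_col, last_start and PointCol are hypothetical.

```r
library(ggplot2)

# Made-up data in the same shape as df1/df2 (columns Year and Data),
# assumed to be sorted by Year.
df <- data.frame(Year = 2001:2010,
                 Data = c(10, 20, 30, 40, 50, 60, 70, 5, 5, 5))

# Mean of the last three rows and 66th percentile of the remaining rows,
# mirroring LYM3() and perc66() from the question.
last_mean <- mean(tail(df$Data, 3), na.rm = TRUE)
rest_q66  <- quantile(head(df$Data, -3), 0.66, names = FALSE, na.rm = TRUE)

# One colour for the whole three-year block, decided once.
recent_col <- if (last_mean < rest_q66) "green" else "red"

# First of the last three years (the value fun3() returns).
last_start <- max(df$Year) - 2

# Store literal colour names in a column ...
df$PointCol <- ifelse(df$Year < last_start, "black", recent_col)

# ... and ask ggplot2 to use them as-is instead of mapping them to a palette.
p <- ggplot(df, aes(x = Year, y = Data)) +
  theme_bw() +
  geom_line() +
  geom_point(aes(colour = PointCol), size = 3) +
  scale_colour_identity()
```

Printing p draws the plot. Replacing df with df1 or df2 should work the same way, since only the Year and Data columns are used.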

I also tried to put the plotting code into a function, so that I wouldn’t have to change the data frame name in every function call within the plot, but I can’t get it to work.
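
One way the function version could look, as a minimal sketch under assumptions: the data frame is passed in as an argument, it has columns named Year and Data, and its rows are sorted by Year. The helper names add_point_colour and plot_last3 are hypothetical. The colours are stored as literal names in a column and drawn with scale_colour_identity(), which tells ggplot2 to use the values directly rather than mapping them to a palette.

```r
library(ggplot2)

# Hypothetical helper: adds a PointCol column holding literal colour names.
# Assumes columns Year and Data, with rows sorted by Year.
add_point_colour <- function(d) {
  last_mean <- mean(tail(d$Data, 3), na.rm = TRUE)
  rest_q66  <- quantile(head(d$Data, -3), 0.66, names = FALSE, na.rm = TRUE)
  recent_col <- if (last_mean < rest_q66) "green" else "red"
  d$PointCol <- ifelse(d$Year < max(d$Year) - 2, "black", recent_col)
  d
}

# Hypothetical wrapper: builds the plot for whichever data frame is passed in,
# so the data frame name appears only once, in the call.
plot_last3 <- function(d) {
  d <- add_point_colour(d)
  ggplot(d, aes(x = Year, y = Data)) +
    theme_bw() +
    geom_line() +
    geom_point(aes(colour = PointCol), size = 3) +
    scale_colour_identity()
}

# Made-up example where the last three values exceed the rest (red case).
df_red <- data.frame(Year = 2001:2010,
                     Data = c(10, 20, 30, 40, 50, 60, 70, 75, 75, 75))
coloured <- add_point_colour(df_red)
p <- plot_last3(df_red)
```

Calling plot_last3(df1) or plot_last3(df2) would then be the only place the data frame name needs to change.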

Thank you!

© Stack Overflow or respective owner
