ggplot2 density histogram with width=.5, vline and centered bar positions

I want a nice density (that sums to 1) histogram for some discrete data. I have tried a couple of ways to do this, but none were entirely satisfactory.

Generate some data:

#data
set.seed(-999)
d.test = data.frame(score = round(rnorm(100,1)))
mean.score = mean(d.test[,1])
d1 = as.data.frame(prop.table(table(d.test)))

The first gives the right placement of bars -- centered on top of the number -- but the wrong placement of vline(). This is because the x-axis is discrete (factor) and so the mean is plotted using the number of levels, not the values. The mean value is .89.

ggplot(data=d1, aes(x=d.test, y=Freq)) +
  geom_bar(stat="identity", width=.5) +
  geom_vline(xintercept=mean.score, color="blue", linetype="dashed")

The second gives the correct vline() placement (because the x-axis is continuous), but wrong placement of bars and the width parameter does not appear to be modifiable when x-axis is continuous (see here). I also tried the size parameter which also has no effect. Ditto for hjust.

ggplot(d.test, aes(x=score)) +
  geom_histogram(aes(y=..count../sum(..count..)), width=.5) +
  geom_vline(xintercept=mean.score, color="blue", linetype="dashed")

Any ideas? My bad idea is to rescale the mean so that it fits with the factor levels and use the first solution. This won't work well in case some of the factor levels are 'missing', e.g. 1, 2, 4 with no factor for 3 because no datapoint had that value. If the mean is 3.5, rescaling this is odd (x-axis is no longer an interval scale).

Another idea is this:

ggplot(d.test, aes(x=score)) +
  stat_bin(binwidth=.5, aes(y= ..density../sum(..density..)), hjust=-.5) +
  scale_x_continuous(breaks = -2:5) + #add ticks back
  geom_vline(xintercept=mean.score, color="blue", linetype="dashed")

But this requires adjusting the breaks, and the bars are still in the wrong positions (not centered). Unfortunately, hjust does not appear to work.

How do I get everything I want?

  • density sums to 1
  • bars centered above values
  • vline() at the correct number
  • width=.5

With base graphics, one could perhaps solve this problem by plotting twice on the x-axis. Is there some similar way here?

Answers


It sounds like you just want to make sure that your x-axis values are numeric rather than factors

ggplot(data=d1, aes(x=as.numeric(as.character(d.test)), y=Freq)) +
  geom_bar(stat="identity", width=.5) +
  geom_vline(xintercept=mean.score, color="blue", linetype="dashed") + 
  scale_x_continuous(breaks=-2:3)

which gives


Need Your Help

What does RegAsm really do? Where are files copied?

c# dll regasm dllregistration ie-plugins

We have a plugin for IE based on spicIE, the purpose is to connect to some external devices.

How can a user input an array of any length in java?

java arrays

I'm writing a program in java that requires the user to input a series of numbers, for example 1 2 3 4 .. until the user decides to stop and press enter. Normally what I would do is something like ...