Sara Amato wrote:
> On Dec 5, 2012, at 11:23 AM, Jonathan Rochkind wrote:
> > Hmm, it's quite possible you know more about statistics than me, but...
> > Usually equations for calculating confidence level are based on
> > the assumption of a random sample, not a volunteering
> > self-selected sample.
> I'd been staying out of this discussion, but the thought occurs to
> me that someone with access to the list of subscribers might run
> that against a list of traditional boy/girl names, and be able to
> make some guesses….
With my (rather dusty through lack of formal use) stats grad hat on,
I'd say Jonathan Rochkind is correct: the assumptions behind those
calculations are violated. http://www.jerrydallal.com/LHSP/ci.htm
explains more about confidence intervals, but the usual calculations
require independent random sampling.
(LHSP was a good web book and may be worth a read if you want help
with stats, but it seems that there won't be any more web editions for
now, thanks to the evil Kindle system. If only it were FOSS.)
What happened here is sometimes called a Self-selected Listener Opinion
Poll (SLOP), like the ones radio stations or newspapers run, and it's
not random.
It may still be informative, but I'd not suggest the calculated
confidence intervals are valid.
Guessing from the names may be informative - especially about how many
people use forms that aren't easily identifiable in that way - but I
think the usual approach would be to use random numbers to draw a
sample from the subscribers and just ask those the detailed questions.
Then you could work out a CI and so on in the usual way.
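As a rough sketch of that approach (not a recipe): the Python below
draws a simple random sample without replacement and then works out a
normal-approximation 95% interval for a proportion. The function names,
the example.org addresses and the finite population correction are my
own assumptions for illustration, not anything the list software
provides. It also assumes everyone sampled actually answers; heavy
non-response would bring self-selection back in.

```python
import math
import random


def draw_sample(subscribers, n, seed=None):
    """Draw a simple random sample of n subscribers, without replacement."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(subscribers, n)


def proportion_ci(successes, n, population=None, z=1.96):
    """Normal-approximation (Wald) interval for a proportion.

    z=1.96 gives roughly 95% confidence. If the population size is
    given, a finite population correction is applied (an assumption
    added here because a subscriber list is small and finite).
    """
    p = successes / n
    se = math.sqrt(p * (1 - p) / n)
    if population is not None and population > 1:
        se *= math.sqrt((population - n) / (population - 1))
    return max(0.0, p - z * se), min(1.0, p + z * se)


# Hypothetical subscriber list; real addresses would come from the list owner.
subscribers = [f"subscriber{i}@example.org" for i in range(2000)]
to_survey = draw_sample(subscribers, 200, seed=2012)

# Suppose 90 of the 200 sampled people gave a particular answer
# (made-up figures, purely to show the call).
low, high = proportion_ci(90, 200, population=len(subscribers))
```

The Wald interval is the simplest textbook one and behaves badly for
small samples or proportions near 0 or 1, so treat it as a starting
point only.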
Some years ago, I wrote more about surveying at
if you want overkill. Some links are stale at the moment.
Hope that helps,
MJ Ray (slef), member of www.software.coop, a for-more-than-profit co-op.
http://koha-community.org supporter, web and library systems developer.
In My Opinion Only: see http://mjr.towers.org.uk/email.html
Available for hire (including development) at http://www.software.coop/