I'd been staying out of this discussion, but the thought occurs to me that
someone with access to the list of subscribers might run that against a
list of traditional boy/girl names, and be able to make some guesses....

On Dec 5, 2012, at 11:23 AM, Jonathan Rochkind wrote:

> Hmm, it's quite possible you know more about statistics than me, but...
>
> Usually equations for calculating confidence level are based on the
> assumption of a random sample, not a volunteering self-selected sample.
>
> If you have a self-selected sample, then the equations for "how likely is
> this to be a fluke" are only accurate if your self-selected sample is
> representative; and there aren't really any equations that can tell you
> how likely your self-selected sample is to be representative, it depends
> on the circumstances (which is why for the statistical equations to be
> completely valid, you need a random sample).
>
> Is my understanding.
>
> On 12/5/2012 2:18 PM, Rosalyn Metz wrote:
>> Ross,
>>
>> I totally get what you're saying, I thought of all of that too, but
>> according to everything I was reading through, the likelihood that the
>> survey's results are a fluke is extremely low. It's actually the reason I
>> put information in the write up about the sample size (378), population
>> size (2,250), response rate (16.8%), confidence level (95%), and
>> confidence interval (+/- 4.6%).
>>
>> Rosalyn
>>
>>
>> On Wed, Dec 5, 2012 at 1:52 PM, Ross Singer wrote:
>>
>>> Thanks, Rosalyn, for setting this up and compiling the results!
>>>
>>> While it doesn't change my default position, "yes we need more diversity
>>> among Code4lib presenters!", I'm not sure, statistically speaking, that
>>> you can draw the conclusions you have based on the sample size,
>>> especially given the survey's topic (note, I am not saying that women
>>> aren't underrepresented in the Code4lib program).
>>>
>>> If 83% of the mailing list didn't respond, we simply know nothing about
>>> their demographics. They could be 95% male, they could be 99% female, we
>>> have no idea. I think it is safe to say that the breakdown of the 16% is
>>> probably biased towards females simply given the subject matter and the
>>> dialogue that surrounded it. We simply cannot project that the mailing
>>> list is 57/42 from this, I don't think.
>>>
>>> What is interesting, however, is that the number roughly corresponds to
>>> the number of seats in the conference. I think it would be interesting
>>> to see how this compares to the gender breakdown at the conference.
>>>
>>> This doesn't diminish how awesome it is that you put this together,
>>> though. Thanks again to you and Karen!
>>> -Ross.
>>>
>>> On Dec 5, 2012, at 1:28 PM, Rosalyn Metz wrote:
>>>
>>>> Hi Friends,
>>>>
>>>> I put together the data and a summary for the gender survey. Now that
>>>> conference and hotel registration has subsided, it's a perfect time for
>>>> you to kick back and read through.
>>>>
>>>> [Code4Lib] Gender Survey Data
>>>> <https://docs.google.com/spreadsheet/ccc?key=0AqfFxMd8RTVhdFVQSWlPaFJ2UTh1Nmo0akNhZlVDTlE>
>>>>
>>>> Gender Survey Data is the raw data for the survey. Not very
>>>> interesting, but you can use it to view my Pivot Tables and charts.
>>>>
>>>> [Code4Lib] Gender Survey Summary
>>>> <https://docs.google.com/document/d/1Hbofh63-5F9MWEk8y8C83heOkNodttASWF5juqGLQ1E/edit>
>>>>
>>>> Gender Survey Summary is an easy-to-read version of the above -- it's
>>>> the summary I wrote about the results. Included is a brief intro,
>>>> charts (from above), and a summary of the results.
>>>>
>>>> Let the discussion begin,
>>>> Rosalyn
>>>>
>>>> P.S. Much thanks to Karen Coyle for reviewing the summary for me before
>>>> I sent it out. Also if there are any typos or grammar mistakes, please
>>>> blame my friend Abigail who behaved as my editor.
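
For anyone who wants to check the numbers quoted in this thread, here is a rough sketch in Python. It is not the method used in the survey write-up, which the thread does not spell out. The first function uses the textbook margin-of-error formula with a finite population correction, and it assumes a simple random sample, the very assumption questioned above. The second gives worst-case bounds on a population share when nothing at all is assumed about the people who did not respond. The function names, the worst-case p = 0.5, and z = 1.96 for 95% confidence are my own choices for illustration; the 0.57 is just the larger figure from the "57/42" split mentioned above.

```python
import math


def margin_of_error(n, N, p=0.5, z=1.96):
    """Margin of error for a proportion, with finite population correction.

    Assumes a simple random sample of size n drawn from a population of N.
    p=0.5 is the most conservative choice; z=1.96 corresponds to 95%.
    """
    standard_error = math.sqrt(p * (1 - p) / n)
    finite_population_correction = math.sqrt((N - n) / (N - 1))
    return z * standard_error * finite_population_correction


def nonresponse_bounds(share_among_respondents, response_rate):
    """Worst-case bounds on a population share under self-selection.

    Lower bound: no non-respondent belongs to the group.
    Upper bound: every non-respondent belongs to the group.
    """
    lower = share_among_respondents * response_rate
    upper = lower + (1 - response_rate)
    return lower, upper


sample_size, population_size = 378, 2250
response_rate = sample_size / population_size

moe = margin_of_error(sample_size, population_size)
low, high = nonresponse_bounds(0.57, response_rate)

print(f"response rate:   {response_rate:.1%}")
print(f"margin of error: +/- {moe:.1%}")
print(f"worst-case range for the 57% group: {low:.1%} to {high:.1%}")
```

Under the random-sample assumption the first calculation comes out at about +/- 4.6%, matching the figure in the write-up. Without that assumption, the second calculation shows the same data are consistent with a share anywhere from roughly 10% to 93% of the full list, which is the gap between the two positions in this thread.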