In re GitHub facets - some projects do use labels specifically to mark
issues that may be beginner-friendly. The terms they use vary, but you can
find a searchable/filterable aggregation at http://up-for-grabs.net/. Also,
OpenHatch is a friendly community aimed at getting people involved, and
they have a place where you can search for issues that might be relevant
(e.g. https://openhatch.org/search/?language=Python&q= ).

On Fri, Nov 3, 2017 at 9:29 AM, Julie Swierczek wrote:
> Dan,
>
> This. Is. Awesome. And your interpretation of the date string "1933,
> 1937-1938, 1941" is correct - I meant to say it should be 1933/1941. This
> sort of error is exactly why I wanted to approach this programmatically,
> and not type the dates by hand. I used student employees to copy the data
> from the HTML pages into spreadsheets, and to check for spelling errors.
> However, I didn't want to use students to type the dates. I feel like that
> would be risking the creation of too much metacrap. I can't even type them
> correctly myself, so I can't expect students to have 100% accuracy, either.
>
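For anyone following along, here is a minimal sketch of that interpretation
in Python. It is not Dan's script; the function name collapse_years is
hypothetical, and it assumes the goal is simply "earliest year/latest year"
for a string that lists four-digit years and year ranges.

```python
import re

def collapse_years(date_string):
    """Collapse a list of years and year ranges into 'earliest/latest'.

    Hypothetical helper for illustration only: it assumes every
    four-digit number in the string is a year.
    """
    years = [int(y) for y in re.findall(r"\b\d{4}\b", date_string)]
    if not years:
        return None  # nothing recognizable; leave the value for manual review
    first, last = min(years), max(years)
    return str(first) if first == last else f"{first}/{last}"

print(collapse_years("1933, 1937-1938, 1941"))  # prints 1933/1941
```

Anything without a four-digit year comes back as None, so odd cases can be
flagged for a human rather than guessed at.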
> Also, for anyone else following from home, I have to say why I love this
> solution compared to all the others.
>
> 1) I have over 400 spreadsheets, some with over 1000 lines. While I
> *could* use OpenRefine or Excel for a certain amount of date cleaning, that
> assumes I am interested in - and have the time for - opening each file
> individually and working on the dates one spreadsheet at a time. I can set
> this script up to run through a bunch of csv files. I don't need to look at
> them. (And, yes, I know how to set up a task in OpenRefine and save it and
> use it again later - and I was working on building one of those - but that
> is more time consuming than I want this task to be.)
>
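The "run it over a bunch of csv files" part can be sketched roughly like
this. Again, this is an illustration and not Dan's code: the function name
clean_folder, the folder names, and the column name are all assumptions, and
it writes cleaned copies to a separate folder so the originals stay untouched.

```python
import csv
from pathlib import Path

def clean_folder(in_dir, out_dir, column, fix):
    """Apply fix() to one column of every .csv in in_dir.

    Cleaned copies go to out_dir; the names of the files processed
    are returned. All names here are hypothetical.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    done = []
    for src in sorted(Path(in_dir).glob("*.csv")):
        # Read the whole spreadsheet, keeping the header order.
        with src.open(newline="", encoding="utf-8") as f:
            reader = csv.DictReader(f)
            fields = reader.fieldnames or []
            rows = list(reader)
        # Rewrite only the chosen column; skip rows where it is missing.
        for row in rows:
            if column in row and row[column] is not None:
                row[column] = fix(row[column])
        # Write a cleaned copy under the same file name.
        with (out_path / src.name).open("w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=fields, extrasaction="ignore")
            writer.writeheader()
            writer.writerows(rows)
        done.append(src.name)
    return done
```

A call such as clean_folder("spreadsheets", "cleaned", "date", str.strip)
would process every csv in a folder called spreadsheets without opening any
of them by hand; swap str.strip for whatever date-fixing function you use.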
> 2) This doesn't use Ruby or Perl or other tools that I don't know and
> don't have time to use now. I said I can handle basic Python, and that's
> what this is.
>
> 3) This is written simply and clearly, and doesn't do too much of 'let's
> prove how awesome I am by using as few lines of code as possible', which is
> really hard for newbies to interpret and change. (You know what I'm
> talking about - something that a newbie would write in 200 lines and
> someone else says, "Yeah, you idiot, I can do that in two lines". Cf. ALL
> OF STACK OVERFLOW.)
>
> 4) Building on point number 3, this is written simply and clearly enough
> that I can figure out how to modify it further if I come across any other
> date cases that I haven't discovered so far. I would even feel confident
> enough to submit a pull request if I do develop solutions for other date
> formats for this.
>
> 5) Further, this is written simply and clearly enough that I can use this
> as a model for figuring out how to write other Python stuff to handle other
> similar tasks. This is now my favorite thing in all of GitHub. (I wish
> GitHub had a special facet for 'newbie friendly' stuff. I know that is
> somewhat subjective, but I can't tell you how many 'easy' tools have been
> recommended to me that would take me roughly a week to figure out how
> to run once, and possibly another month of trying to troubleshoot error
> messages to get it to actually work. Cf.
> http://tpverso.com/an-open-letter-to-open-source-projects-for-lams/)
>
> I again want to thank Dan for this code and I also want to commend it to
> everyone else's attention as the sort of code that is really friendly to
> newbies. If you are thinking of writing a tool and you want to be able to
> share it with institutions of all sizes, with a really low barrier to entry
> (e.g., the knowledge of how to put a .py file in a directory, change the
> filename in the .py file, and then run 'python test.py'), then this is a
> good model of how code should be written. Also, while I am on my soapbox,
> here's a great model for documentation:
> https://github.com/CarletonArchives/BagBatch.
>
> Thus Endeth the Lecture.
>
> Dan, thanks again. This just made my semester.
>
> Julie Swierczek
> Transformer of Dates
>
--
Andromeda Yelton
Senior Software Engineer, MIT Libraries: https://libraries.mit.edu/
President, Library & Information Technology Association: http://www.lita.org
http://andromedayelton.com
@ThatAndromeda <http://twitter.com/ThatAndromeda>