In the case of XML, I think XPath is the simpler tool.
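
To make that concrete, here is a rough sketch of pulling italic text out of WordprocessingML with path expressions. It assumes the docx XML marks italics with a `<w:i/>` element inside a run's `<w:rPr>` properties; the sample fragment and the `italic_runs` helper are made up for illustration, and Python's built-in ElementTree only supports a subset of XPath (a library such as lxml would be needed for full XPath).

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Hypothetical fragment standing in for one bibliography entry
SAMPLE = f"""<w:document xmlns:w="{W}">
  <w:body>
    <w:p>
      <w:r><w:t xml:space="preserve">SHERMAN, Matt. </w:t></w:r>
      <w:r><w:rPr><w:i/></w:rPr><w:t>A Hypothetical Title</w:t></w:r>
      <w:r><w:t xml:space="preserve"> (2015).</w:t></w:r>
    </w:p>
  </w:body>
</w:document>"""


def italic_runs(xml_text):
    """Return the text of every run whose properties switch italics on."""
    root = ET.fromstring(xml_text)
    results = []
    for run in root.findall(".//w:r", NS):
        italic = run.find("w:rPr/w:i", NS)
        if italic is None:
            continue
        # <w:i w:val="false"/> (or "0"/"off") explicitly turns italics off
        if italic.get(f"{{{W}}}val", "true") in ("false", "0", "off"):
            continue
        results.append("".join(t.text or "" for t in run.findall("w:t", NS)))
    return results


print(italic_runs(SAMPLE))
```

Note that this only looks at direct run formatting; italics applied through a named style would not show up this way.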
---- Brian Zelip wrote ----
Hi Matt.
Re: finding words in all caps, yes it's possible. See this SO answer to
help: http://stackoverflow.com/a/4255225/2145103
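
As a minimal sketch (not necessarily the pattern from that answer), something like the following would pick out all-caps words in Python. The sample string and the `find_all_caps` name are just for illustration, and the pattern assumes plain ASCII letters.

```python
import re

# A whole word made of two or more uppercase ASCII letters.
# The {2,} keeps single capitals such as "A" or initials out of the results.
ALL_CAPS = re.compile(r"\b[A-Z]{2,}\b")


def find_all_caps(text):
    """Return every all-caps word found in text, in order of appearance."""
    return ALL_CAPS.findall(text)


print(find_all_caps("SHERMAN, Matt. A Title about XML (2015)."))
```

Names containing apostrophes or hyphens (O'BRIEN, for instance) would be split by this pattern, so it would likely need adjusting for a real bibliography.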
Re: italics, my hunch is that you could do so if you got hold of the XML
behind the Word doc, which I'd assume would have something like
`<italic>` tags or attribute values of `italic` in the markup.
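
For getting hold of that XML: a .docx file is a zip archive, and the main body normally lives in `word/document.xml`. A small sketch of reading it in Python follows; the `read_document_xml` helper is a made-up name, and the in-memory archive is only a stand-in so the example runs without a real file.

```python
import io
import zipfile


def read_document_xml(docx):
    """Return the raw bytes of word/document.xml from a .docx path or file object."""
    with zipfile.ZipFile(docx) as archive:
        return archive.read("word/document.xml")


# Stand-in archive (not a complete docx) so the sketch is runnable as-is
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr(
        "word/document.xml",
        '<w:document xmlns:w="http://schemas.openxmlformats.org/'
        'wordprocessingml/2006/main"/>',
    )
demo.seek(0)

xml_bytes = read_document_xml(demo)
print(xml_bytes.decode("utf-8"))
```

With a real file you would pass its path instead, e.g. `read_document_xml("bibliography.docx")` (hypothetical filename), and then search the result for the italic markup.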
Good luck!
Brian Zelip
---
Emerging Technologies Librarian
Health Sciences & Human Services Library
University of Maryland, Baltimore
410-706-8865
On Tue, Jul 7, 2015 at 11:56 AM, Matt Sherman wrote:
> Hi all,
>
> I am working my way through teaching myself regex to parse an annotated
> bibliography docx file and had a question as I can't seem to get a succinct
> answer from Google. Is it possible to have regex find words, or in this
> case names, displayed in all caps? Also, similarly, is it possible to
> have regex find words, or in this case titles, that are italicized? Given
> how the document is formatted doing both would be nice so that I could
> parse them into a table or database, but I cannot find a clear answer on
> that, though I am very new to regex so it is probably jumping into the deep
> end on this. Any answers are appreciated.
>
> Matt Sherman
>