On 12 May 2022 20:44:22 GMT+01:00, "Hammer, Erich F" <[log in to unmask]> wrote:
>Danielle,
>
>.DOCX files are just a collection of zipped XML and image files.  You can see this by changing the extension on a copy of the file and then exploring it.  It should be possible to parse out the data from the XML file(s) and build a structure from it.

Yes, the key one is document.xml, but it is very noisy and seems to be
semantic only if the author used styles instead of direct formatting
such as bold, italics and so on.
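
As a rough illustration of that point, here is a minimal sketch using only
the Python standard library. It assumes the usual layout where the body
lives at word/document.xml; the function name read_paragraphs is my own
invention, not part of any library. For each paragraph it reports the
style id (if the author applied one), the text, and whether any run
carries direct bold formatting instead.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def read_paragraphs(docx):
    """Return (style_id, text, has_direct_bold) for each paragraph.

    `docx` is a path or a binary file-like object. Hypothetical helper,
    a sketch only: it ignores headers, footnotes and most other noise.
    """
    with zipfile.ZipFile(docx) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    rows = []
    for p in root.iter(f"{{{W}}}p"):
        # A paragraph style is only present if the author applied one.
        style = p.find("w:pPr/w:pStyle", NS)
        style_id = style.get(f"{{{W}}}val") if style is not None else None
        # Text is split across many <w:t> elements; join them back up.
        text = "".join(t.text or "" for t in p.iter(f"{{{W}}}t"))
        # Simplification: any <w:b> on a direct child run counts as bold,
        # even though a w:val attribute can switch it off.
        bold = p.find("w:r/w:rPr/w:b", NS) is not None
        rows.append((style_id, text, bold))
    return rows


# Example call (the file name is a placeholder):
# for style_id, text, bold in read_paragraphs("report.docx"):
#     print(style_id, bold, text)
```

A paragraph that comes back with a style id like a heading style tells you
something about structure; one with no style and bold set is just a guess
that the author meant it as a heading.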

--
MJR
https://www.software.coop