On 09/06/12 06:36, Kyle Banerjee wrote:

> How do you guys deal with large XML files?

There have been a number of excellent suggestions from other people, but 
it's worth pointing out that sometimes low tech is all you need.

I frequently use sed to do things such as replace one domain name with
another when a website changes its URL.

Short for Stream EDitor, sed is a core part of POSIX and should be
available on pretty much every UNIX-like platform imaginable. For
non-trivial files it is usually limited by disk access rather than by
its own processing (i.e. it works about as fast as a naive file copy).
Regular expressions are supported: POSIX basic ones by default, and
extended ones with -E on most implementations.

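sed 's/www\.example\.net/example.com/gI' < IN_FILE > OUT_FILE

will stream IN_FILE to OUT_FILE, replacing all instances of
"www.example.net" with "example.com". The dots are escaped so that they
match a literal "." rather than any character; the I flag
(case-insensitive matching) is a GNU extension, so drop it on other seds.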
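
Here is a minimal sketch of the same substitution that can be tried
without an input file. The function name replace_domain and the sample
line are made up for illustration, and the GNU-only I flag is left out so
that it should run under any POSIX sed.

```shell
# Hypothetical helper: reads XML (or any text) on stdin, writes it to
# stdout with the old host name replaced. Nothing is buffered beyond the
# current line, so the size of the input does not matter much.
replace_domain() {
    # Escaped dots match a literal "."; g replaces every match on a line.
    sed 's/www\.example\.net/example.com/g'
}

# Try it on a single made-up line instead of a large file.
printf '%s\n' '<link href="http://www.example.net/a"/>' | replace_domain
# prints: <link href="http://example.com/a"/>
```

For a real file the call would be: replace_domain < IN_FILE > OUT_FILE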

cheers
stuart
-- 
Stuart Yeates
Library Technology Services http://www.victoria.ac.nz/library/