LISTSERV 16.5 - CODE4LIB Archives

The following application may be useful for your task.  I created this
application at the National Archives.  The team that I was on used this
application for a number of file system analysis tasks.

https://github.com/usnationalarchives/File-Analyzer

This application will allow you to select a recipe to use when crawling a
file system.  The recipe that you select will determine the type of report
that will be generated.  Once the report is generated, you can filter and
sort for information of interest.  Essentially, the application converts
the tree structure of the file system into a table structure.  The table
structure seemed to simplify decisions about a complex file hierarchy.
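
To make the tree-to-table idea concrete, here is a minimal Python sketch.
It is not the File Analyzer's own code, and the names in it (crawl_to_table,
filter_rows, sort_rows, and the column names) are hypothetical, chosen only
for illustration.  It walks a directory and emits one row per file, which
can then be filtered and sorted like any other table.

```python
import os
from pathlib import Path


def crawl_to_table(root):
    """Walk `root` and return one row (a dict) per file found.

    The columns are illustrative assumptions, not the File Analyzer's
    actual report format.
    """
    root = Path(root)
    rows = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = Path(dirpath) / name
            rel = full.relative_to(root)
            rows.append({
                "directory": rel.parent.as_posix(),  # "." for files directly in root
                "name": name,
                "extension": full.suffix.lower(),
                "depth": len(rel.parts) - 1,  # 0 for files directly in root
                # lstat() does not follow symlinks, so a broken link
                # cannot abort the crawl.
                "size_bytes": full.lstat().st_size,
            })
    return rows


def filter_rows(rows, hide_extensions=()):
    """Drop rows whose extension is in `hide_extensions` (e.g. images, css, js)."""
    hidden = {ext.lower() for ext in hide_extensions}
    return [row for row in rows if row["extension"] not in hidden]


def sort_rows(rows, key="size_bytes", reverse=True):
    """Return the rows sorted by one column, largest first by default."""
    return sorted(rows, key=lambda row: row[key], reverse=reverse)
```

For example, filter_rows(crawl_to_table("/var/www"), [".jpg", ".css", ".js"])
would leave only the non-asset files, where "/var/www" stands in for
whatever your document root happens to be.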

Terry

On Thu, Aug 30, 2012 at 6:02 PM, Shearer, Timothy J wrote:

> Hi Folks,
>
> My query may have been poorly expressed...
>
> What we have is a webserver with 64,665 files (html, css, js, jpg, you get
> the idea) and lots of directories with subdirectories.
>
> The goal is to be able to conveniently take all that in in a way that
> makes it pretty simple to see/navigate (say for a public services staff
> member tasked with doing a survey of the old content) so that we can get a
> handle on what's there (prior to say, moving from a php+html template
> approach to a CMS).  It's about exploring the website from under the hood.
>
> In my limited imagination it might look like: the document tree
> represented in xml as viewed through a web browser.  Expanding/contracting
> nodes (and being able to recursively explode the view at any node).
> Maybe choose to hide things like image, css, and js files.  Annotation
> would be lovely (say at a subdirectory be able to say: "this one's old and
> needs to go", "this one we keep as is", "this one needs to be reworked
> entirely").  And in an ideal world state could be preserved...if you'd
> expanded/contracted chunks as you were exploring, you could come back
> later and be where you were in your exploration.
>
> tree expresses the file system as (strangely enough) a tree, but the
> output is not interactive and it's huge and unwieldy to deal with.  If you
> find a subdirectory that's full of thousands of files that are irrelevant
> to the task of getting a handle on the overall content, they're on the
> screen and you page and page down and eventually lose track of where they
> are in the directory hierarchy.
>
> I'm more interested in how other shops help users understand a huge old
> webserver's content than focusing on a specific tool such as the one my
> brain imagines.
>
> Thanks for the feedback so far!
>
> Tim
>



-- 
Terry Brady
Applications Programmer Analyst
Lauinger Information Technology
202-687-7053