I have had several theoretical changes of opinion on this question, and
have come to the considered opinion that there is no principled *essential*
difference between Metadata and Data. It all depends on the
context/theory/background assumptions to which the data is being applied.
The property of Data being meta is entirely use-sensitive. The property of
being information may depend upon the existence of metadata referring to
it.

For example, it is the labeling of an antelope in a zoo as "an antelope" that
turns an ungulate into a document; data measured from this beast gives us
evidence about what "an antelope" is like.

The label and number of the beast, as well as the date of capture and other
provenance, are clearly metadata in this case, and provide the context for
interpreting the data as information, and for assessing the degree of
justification we have for treating this information as knowledge. However,
in other cases, the metadata may serve as data for other studies, with no
reference to our four-legged friend.

Suppose we are doing a study on the rate at which differently labeled
specimens were acquired by zoos across Europe over the course of the 19th
and 20th centuries. In this situation, what was metadata has become our
primary data; *our* metadata relates to the provenance of the descriptions.
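
To make the role reversal concrete, here is a rough sketch in Python. The
records, field names, and both functions are invented purely for
illustration; nothing here comes from a real zoo catalogue. The same fields
act as context in one study and as the thing being counted in the other.

```python
from collections import Counter

# Hypothetical specimen records; every value is made up for illustration.
specimens = [
    {"label": "antelope", "acquired": 1887, "shoulder_height_cm": 92.0},
    {"label": "antelope", "acquired": 1902, "shoulder_height_cm": 88.5},
    {"label": "zebra", "acquired": 1902, "shoulder_height_cm": 130.0},
]


def mean_height(records, label):
    """Measurement study: the label is metadata that selects the data."""
    heights = [r["shoulder_height_cm"] for r in records if r["label"] == label]
    return sum(heights) / len(heights)


def acquisitions_per_century(records):
    """Acquisition study: label and capture date are now the primary data."""
    # (year - 1) // 100 + 1 maps 1801..1900 to 19 and 1901..2000 to 20.
    return Counter(
        (r["label"], (r["acquired"] - 1) // 100 + 1) for r in records
    )
```

Calling `mean_height(specimens, "antelope")` treats the label as context for
the measurements, while `acquisitions_per_century(specimens)` ignores the
measurements entirely and counts the labels and dates themselves.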

Metadata embedded by a smart sensor package included in the same data stream
as the data gathered as part of an observation run is essential to the
interpretation of that data as information. However, it is not the primary
data itself; it is the context. Radar data from early JSTARS platforms was
severely degraded by rain between the platform and the ground; the
information provided needs context about weather conditions in order to
determine the actual amount of information obtained when fusing that
information with data from other sensor systems. However, the weather
readings are not part of the radar data itself.

So, to sum up: it depends; Further Research Is Needed; one man's Meta is
another man's Poisson.

On Feb 14, 2012 9:59 AM, "Michael Hopwood" wrote:
> Having done research, and now working in a very varied metadata role, I
> don't quite understand this discussion about data that is or isn't
> metadata. Scientific data is a great example of structured data, but it's
> not impossible to distinguish it from metadata purely describing a dataset.
> However, if you have scientific research data created during the
> experiments, even if it's "operational", it's clearly part of "the" data.
> This doesn't mean there can't be metadata describing *that data*. Just
> because it's not glamorous data doesn't mean it's not essential to the
> scientific process. Similarly, just being about mundane or procedural
> things doesn't make data into metadata...!
> You're absolutely right, the contextual information is certainly part of
> the experimental outcome in this example; otherwise it would be abstract
> data such as one might use in a textbook example.
> Metadata would describe the dataset itself, not the scientific research.
> There's always a certain ambiguity involved in identifying "the data" as
> distinct from the metadata, and it's a false dichotomy to suggest metadata
> is not useful at all for the domain expert. It's contextual, and the
> definition is always at least partly based on your use case for the data
> and its description.
> -----Original Message-----
> From: Code for Libraries On Behalf Of Nate Vack
> Sent: 14 February 2012 14:45
> Subject: Re: [CODE4LIB] Metadata
> On Tue, Feb 14, 2012 at 1:22 AM, Graham Triggs wrote:
> > That's an interesting distinction though. Do you need all that data in
> > order to make sense of the results? You don't [necessarily] need to
> > know who conducted some research, or when they conducted it in order
> > to analyse and make sense of the data. In the context of having the
> > data, this other information becomes irrelevant in terms of
> > understanding what that data says.
> It is *essential* to understanding what the data says. Perhaps you find
> out your sensor was on the fritz during a time period -- you need to
> know which datasets are suspect. Maybe the blood pressure effect you're
> looking at is mediated by circadian rhythms, and hence by time of day.
> Not all of your data is necessary in every analysis, but a bunch of blood
> pressure measurements in the absence of contextual information is
> universally useless.
> The metadata is part of the data.