Metadata Specifications
Derived from the Functional Requirements:
A Reference Model for
Business Acceptable Communications
In conformity with the functional requirements for evidence, we
assert
that evidence can only be made by compliant organizations using
responsible, implemented and consistent recordkeeping systems.
Records
captured by such systems must be comprehensive, identifiable,
accurate,
understandable, meaningful and authorized. They must be
maintained
inviolate, coherent, auditable and removable. And to be used they
must be
available, renderable, evidential, exportable and redactable.
In addition to satisfying the requirements for evidence, business
acceptable communications must carry metadata to satisfy the
requirements
of large scale, distributed implementations over long periods of
time
during which human memories of the contexts of creation will not
suffice
and software and hardware will have significantly changed.
The following reference model proposes a six-layer structure of
metadata designed to satisfy the functional requirements for
evidence and the requirements of business acceptable communication,
and to support the
effective management of any record over long periods of time.
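As a rough illustration only, the six layers named in this model can be listed in the order in which they are presented below. The enum and function names are assumptions made for this sketch and are not part of the specification.

```python
from enum import IntEnum


class Layer(IntEnum):
    """The six layers of the reference model, numbered as in the outline."""
    HANDLE = 1
    TERMS_AND_CONDITIONS = 2
    STRUCTURAL = 3
    CONTEXTUAL = 4
    CONTENT = 5
    USE_HISTORY = 6


def layers_in_order():
    """Return the layer names in outline order (I through VI)."""
    return [layer.name for layer in sorted(Layer)]
```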
For more information, please refer to the following papers:
Metadata Requirements for Evidence, by David Bearman,
Archives
& Museum Informatics, and Ken Sochats, University of Pittsburgh, School of Information
Sciences
Item Level Control and Electronic Recordkeeping, by David
Bearman, Archives & Museum Informatics, DRAFT of a paper given
at the 1996 Society of American Archivists Meeting in San Diego, CA, August 29,
1996.
I. HANDLE LAYER
Declares the data that follows to be a record, assigns values
indicating the provenance of the record, and
provides terms by which the contents of the record can be
discovered.
I.A. Record Identification Metadata (Not
Repeatable)
Consists of a unique identifier made up of three data
elements (Record-Declaration,
Transaction-Domain-Identifier,
Transaction-Instance-Identifier).
I.A.1. Record-Declaration [Mandatory]
Identifies the data as a record. This data
element consists of a bit stream asserting that
what follows is a record. The presence of the record declaration
can be determined without opening the
record, but if the record is opened it loses this value.
I.A.2. Transaction-Domain-Identifier
[Mandatory]
Uniquely identifies the domain from which the
record originated with sufficient
specificity to identify the transaction-type and the organization
responsible.
I.A.3. Transaction-Instance-Identifier
[Mandatory]
Uniquely identifies a transaction instance with date, time and
necessary sequence identifiers.
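The three-part identifier described in I.A might be modelled as follows. This is a minimal sketch: the class name, the field types, and the timestamp-plus-sequence format of the instance identifier are assumptions, since the model does not prescribe any encoding.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class RecordIdentifier:
    record_declaration: bytes             # I.A.1, bit stream marking a record
    transaction_domain_identifier: str    # I.A.2
    transaction_instance_identifier: str  # I.A.3

    def __post_init__(self):
        # All three elements are mandatory in the model.
        for name in ("record_declaration",
                     "transaction_domain_identifier",
                     "transaction_instance_identifier"):
            if not getattr(self, name):
                raise ValueError(f"{name} is mandatory")


def make_instance_identifier(when: datetime, sequence: int) -> str:
    # Hypothetical format: ISO 8601 date and time plus a sequence number.
    return f"{when.isoformat()}#{sequence:06d}"
```

Making the dataclass frozen reflects the idea that an identifier, once assigned, should not change; that choice is likewise an assumption of the sketch.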
I.B. Information Discovery Content Metadata
(Repeatable) Provides descriptors considered
necessary to retrieve the record at a later date.
I.B.1. Content-Description-Standard [Optional,
except in cases of privacy act defined content]
Identifies standards governing
content-descriptors. Privacy controlled content must be
identified according to privacy act standards.
I.B.2. Content-Descriptor [Optional]
Provides terms used by the office of
origin/receipt to describe or index the
record.
I.B.3. Record-Natural-Language
[Optional]
Identifies the natural language of the record
(e.g. English, French,
Portuguese).
II. TERMS & CONDITIONS
LAYER
Invokes controls over access to, and use and disposition of a
record. Identifies restrictions imposed on
access and use and where to resolve them.
II.A. Restrictions Status Metadata
Identifies whether any restrictions must be resolved
before permitting access, use, or
disposition.
II.B. Access Conditions Metadata
(Repeatable)
Identifies the conditions for access to the record and
how
to satisfy them.
II.B.1. Access-Conditions-Resolver [Mandatory for
records with access restrictions]
Identifies any resolvers that must be satisfied that the
access requester meets conditions
regarding payments, permissions, proof of identity or other
restrictions on access.
II.B.2 Resolver-Terms [Mandatory for records with
access restrictions]
Defines terms for access in a way that is
recognized by the resolver.
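The conditional rule in II.B (both elements are mandatory only when the record carries access restrictions) could be checked along these lines. The class and function names, and the use of plain strings for resolvers and terms, are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AccessCondition:
    resolver: Optional[str] = None        # II.B.1 Access-Conditions-Resolver
    resolver_terms: Optional[str] = None  # II.B.2 Resolver-Terms


def missing_access_elements(restricted: bool,
                            conditions: List[AccessCondition]) -> List[str]:
    """List the II.B elements that are required but absent."""
    if not restricted:
        return []  # nothing is mandatory for unrestricted records
    if not conditions:
        return ["Access-Conditions-Resolver", "Resolver-Terms"]
    missing = []
    for i, cond in enumerate(conditions):  # II.B is repeatable
        if not cond.resolver:
            missing.append(f"Access-Conditions-Resolver[{i}]")
        if not cond.resolver_terms:
            missing.append(f"Resolver-Terms[{i}]")
    return missing
```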
II.C. Use Conditions Metadata (Repeatable)
Identifies the conditions for use of the record and how
to
satisfy them.
II.C.1. Use-Conditions-Resolver [Mandatory for
records with use restrictions]
Identifies the resolvers that must be satisfied that the
user meets conditions imposed on use and
that the recordkeeping system is notified of how to impose such
restrictions.
II.C.2 Use-Terms [Mandatory for records with use
restrictions]
II.C.2.a. Use-Citation
[Optional]
Consists of textual information
supplied
by the creator or owner of the record
detailing limitations on use.
II.C.2.b. Redacted-Record-Rule
[Mandatory if content view must be
restricted]
Identifies views that are permitted to
different users. It may be executed
algorithmically or may require human intervention to produce a
releasable view.
II.C.2.c. License-Terms [Mandatory
for Licensed Data]
If the data is licensed, this data
enables the proper resolution of use of the
record according to the guidelines set by the
license.
II.D. Disposition Requirements Metadata (Not
Repeatable)
Identifies the conditions regarding retention and
disposition of the records according to
policy.
II.D.1. Removal-Authority [Mandatory]
Identifies under whose/what authority a record
(whole or in part) may be purged from
the system. The identification of this authority resides with the
record and is established at the time of
the record's creation.
II.D.2. Retention-Policy-Citation
[Mandatory]
Consists of textual information identifying the
organization's internal policy/policies
for the record's retention. Indicates the specific policy(ies)
governing retention and links to authority
issuance.
II.D.3. Retention-Authority-Issuance [Optional;
unless retention-period-end-time is
unspecified]
Comprised of textual information regarding the
legislative or governmental
law(s)/regulation(s) governing record retention (ex. Code of
Federal Regulations) - indicating the
specific legal/regulatory policy number, version, dates issued,
dates effective, etc.
II.D.4. Retention-External-Authority [Optional;
unless retention-period-end-time is
unspecified]
Comprised of textual information identifying the
issuing organization that has
jurisdiction over the law(s)/regulation(s) governing records
retention.
II.D.5. Retention-Period-End-Time
[Mandatory]
Indicates the scheduled retention period end date
(mmddyyyy) for the record. This
information is determined at the time of the record's creation.
If unspecified (frequently indicated as 99999999), the record must
contain citations to policy, regulation
and authority (II.D.2-4).
II.D.6. Disposition-Instruction-Code
[Optional]
Identifies the methods that apply to the
ultimate
disposition of the record.
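The interplay between II.D.5 and II.D.2-4 can be sketched in code. The example below assumes the metadata is held in a dictionary keyed by element name (with names written in fully hyphenated form) and that 99999999 is the marker for an unspecified end date; both are assumptions, not requirements of the model.

```python
from datetime import date, datetime
from typing import Dict, List, Optional

UNSPECIFIED = "99999999"  # marker for "no scheduled end date"


def parse_retention_end(value: str) -> Optional[date]:
    """Parse an mmddyyyy end date; return None when unspecified."""
    if value == UNSPECIFIED:
        return None
    return datetime.strptime(value, "%m%d%Y").date()


def missing_disposition_elements(meta: Dict[str, str]) -> List[str]:
    """List the II.D elements that are required but absent."""
    required = ["Removal-Authority",
                "Retention-Policy-Citation",
                "Retention-Period-End-Time"]
    # II.D.3 and II.D.4 become mandatory when the end time is unspecified.
    if meta.get("Retention-Period-End-Time") == UNSPECIFIED:
        required += ["Retention-Authority-Issuance",
                     "Retention-External-Authority"]
    return [name for name in required if not meta.get(name)]
```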
III. STRUCTURAL LAYER
Consists of metadata about data structure designed to permit the
record to remain evidential over time
and to be migrated to new software and hardware dependencies as
necessary.
III.A. File Identification Metadata (Repeatable for each
file)
Enables the identification of individual files that
comprise the record and affords the ability to
verify their authenticity.
III.A.1.File-ID [Mandatory]
Identifies each file that makes up the record.
This affords the ability for the system to
bring together all of the parts to form the whole.
III.B. File Encoding Metadata (Repeatable for each
file)
Identifies the encoding pertinent to the individual
files
that comprise the record.
III.B.1.File-Modality [Mandatory]
Identifies the file modality (e.g. text,
numeric, graphic, geographic, image, sound,
video, multimedia, etc.).
III.B.2.File-Data-Representation
[Mandatory]
Identifies the data encoding standards used by
the
file (e.g. ASCII, EBCDIC, or
UNICODE character data, ASN.1, CCITT Group III raster,
etc.).
III.B.3.Data-Codes [Mandatory if non-standard
methods of representation are used]
Indicates specifically how the data is encoded
when registered methods are not being
used. For example, for vector data whether it is topological,
spaghetti, chain-node, etc., for raster data
the number of dots per inch and their bit density, for sampled
data
the number of samples per second,
etc.
III.B.4.Compression-Method [Mandatory]
Identifies the method of compression, if any,
that
was used (ex: None, JPEG, MPEG,
Quicktime, LZW, etc.). If the method complies with a specific
standard, this may consist of only the
identification of that standard (name, version, etc.),
otherwise the method may need to be defined in
technical detail.
III.B.5.Encryption-Method [Mandatory]
Identifies the algorithms used by the record
originator to encrypt the record's content.
All records are stored in the decrypted form in which they
would
have been read by
recipients.
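A per-file encoding entry (III.B) might be represented as below. The set of "registered" representations is a hypothetical stand-in for whatever registry an implementation would consult, and the class and method names are likewise assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical registry of representations treated as "registered methods".
REGISTERED_REPRESENTATIONS = {"ASCII", "EBCDIC", "UNICODE", "ASN.1"}


@dataclass
class FileEncoding:
    file_modality: str                # III.B.1
    file_data_representation: str     # III.B.2
    compression_method: str           # III.B.4, e.g. "None", "JPEG", "LZW"
    encryption_method: str            # III.B.5
    data_codes: Optional[str] = None  # III.B.3, conditionally mandatory

    def problems(self) -> List[str]:
        """Return a message for each rule in III.B that is not met."""
        issues = []
        for label, value in (("File-Modality", self.file_modality),
                             ("File-Data-Representation", self.file_data_representation),
                             ("Compression-Method", self.compression_method),
                             ("Encryption-Method", self.encryption_method)):
            if not value:
                issues.append(f"{label} is mandatory")
        if (self.file_data_representation
                and self.file_data_representation not in REGISTERED_REPRESENTATIONS
                and not self.data_codes):
            issues.append("Data-Codes is mandatory when a non-registered "
                          "representation is used")
        return issues
```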
III.C. File Rendering Metadata (Repeatable for each
file)
Identifies how the record appeared in order to recreate
it
as it would have been viewed at the
time of receipt.
III.C.1.Application-Dependency [Mandatory,
Repeatable]
Indicates which applications, if any, the record
is dependent upon. If there are
dependencies, the name of one application, the version, and
registration information is recorded in each
occurrence of the field at the time of record creation. This
information is intended to serve as a pointer
to a registered library maintained by the creating organization
or
a public entity such as the Copyright
Office or Patent Office.
III.C.2.Software-Environment-Dependency [Mandatory
- Repeatable]
Indicates what software, including operating
systems and APIs, if any, the record is
dependent upon. If there is a dependency, the name of the
software
package(s), the version, registration
information, and display information (such as font sets or
other
software-dependent attributes) are
recorded at the time of record creation.
III.C.3.Hardware-Dependency [Mandatory -
Repeatable]
Indicates what hardware, if any, the record is
dependent upon. If there is a
dependency, the hardware needed, model number, configuration, and
output information (such as
printers or viewers required or other hardware-dependent
attributes) are recorded at the time of record
creation.
III.C.4.Rendering-Rules [Mandatory -
Repeatable]
Identifies the procedures necessary to enable
the
record to be displayed, printed, or
otherwise represented as it had been at the time of creation
(macros, dimension, spatial reference data,
etc.) - may operate at different levels.
III.C.5.Representation-Standard/De Facto Standard
[Mandatory - Repeatable]
Identifies any standard(s) applied to the file
that affect how the file is rendered (ex:
SGML, PostScript, TIFF, etc.) and which version of the
standard
was used.
III.D. Record Rendering Metadata
Applicable to the record as a whole, once files have
been
correctly rendered according to their
own rules.
III.D.1.File-Linking-Rule/Standard
[Mandatory]
Identifies the rules or standards required to
enable the necessary linkages between files
that make up the record. Contains textual information regarding
the
actual rules or standards
applied.
III.D.2.File-Interchange-Standard: Version
[Mandatory]
Identifies the standard(s) (including
identifying the appropriate version) employed by
the record to enable file interchange.
III.E. Content Structure Metadata
Defines the structure of the contents of the
record.
III.E.1.Content-Structure [Mandatory]
Indicates whether the content of the record is
structured or unstructured.
III.E.2.Content-Data Set [Optional]
If the content is identified as being
structured,
this cites the data set which indicates
how it is structured. Consists of the actual name of the data
set
definition. If a data set definition is
neither registered nor a well-known registered identity, then it
will need to be registered.
III.E.3.Application-Dictionary [Mandatory, if
structured and no content data set]
Identifies the data dictionary for the entire
database. This consists of the actual data
dictionary itself - or it could take the form of a set of
referential integrity controls.
III.E.4.Delimiters/Labels [Optional, good
practice]
Consists of the actual delimiters/labels used
throughout the data and their usage
rules.
III.E.5.Data Value-Lookup Tables [Mandatory, where
present - Repeatable]
Consists of the authority file containing the
values of the codes used throughout the
record and their usage rules.
III.E.6.Data View-at Creation [Mandatory, if
partial
view]
Identifies how the application viewed the record
at the time of the record's creation.
This is the redaction subset of the data dictionary.
III.E.7.Version-Relationships [Mandatory, if prior
version exists]
Consists of any Record-Identifiers of previous
versions of the record.
III.E.8.Set-Relationships [Mandatory, if other set
members exist]
Identifies the record as belonging, for business
purposes, to an overall set of records.
Can consist of the classification of that set, or the
Record-Identifier(s) of other records.
III.E.9.Dynamic-Relationships [Mandatory, if
higher/lower exists]
Identifies what data is required from other
records/files in order to populate other
values. This is active in set relationships where a record
cannot
be opened unless the contents of other
records are available.
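The dependency between III.E.1, III.E.2 and III.E.3 (a data dictionary is required only for structured content that cites no data set) can be expressed as a short check. The function and parameter names are assumptions made for this sketch.

```python
from typing import List, Optional


def content_structure_problems(structured: bool,
                               content_data_set: Optional[str],
                               application_dictionary: Optional[str]) -> List[str]:
    """Check the conditional rule linking III.E.1, III.E.2 and III.E.3."""
    if not structured:
        return []  # unstructured content needs neither element
    if content_data_set or application_dictionary:
        return []  # either one documents the structure
    return ["Application-Dictionary is mandatory when structured content "
            "has no Content-Data Set"]
```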
III.F. Source Metadata
Identifies the source of the record and documents
relevant
circumstances of data
capture.
III.F.1.Data-Source [Mandatory]
Identifies the source that created the record;
e.g.
to the recordkeeping
system.
III.F.2.Data-Source-System-Documentation
[Optional]
Identifies or consists of the documentation that
outlines the conditions needed to create
the record - contains information on the data processing
function.
III.F.3.Data Capture-Instrument-Type [Mandatory,
if
instrument captured source data]
Identifies the type of instrument that was used to
capture the data (e.g. light recording,
sound recording, temperature recording, location recording,
etc.) and the specific instrument used
(manufacturer, model number, etc.).
III.F.4.Data Capture-Instrument-Settings
[Mandatory,
if instrument captured source
data]
Identifies the settings, calibration, etc. that were
in
effect when the data was captured.
III.F.5.Source Data-Quality [Optional, good
practice]
Identifies the degree of reliability of the data
generated by the source.
IV. CONTEXTUAL LAYER
Identifies the provenance (i.e. the person, system, or
instrument that is responsible for generating the
record) of the record and provides data that supports its use
as evidence of a transaction.
IV.A. Transaction Context Metadata
Identifies the transaction of which this is a
record.
IV.A.1. Originator-Identification
[Mandatory]
Identifies the organization/person/system that
initiated the transaction and the time of
the transaction.
IV.A.2. Recipient-Identification
[Mandatory]
Identifies the office/person/system that
received the transaction and the time of
receipt.
IV.A.3. Copy-Identification
[Mandatory]
Identifies whether the copy encapsulated by the
metadata is the sender's or the
recipient's copy.
IV.A.4. Business-Transaction-Type
[Optional]
Identifies the type of transaction (its
business functional context).
IV.A.5. Business-Transaction Procedure Reference
[Optional]
Identifies the originating organization's
specific
policy/policies and/or procedure(s) (i.e.
business rules) governing this type of transaction. May
consist
of citations or of the actual
policy/policies and/or procedure(s). In either case it should
note
the relevant version, effective dates,
etc.
IV.A.6. Linked-Prior Transaction [Mandatory, if
applicable]
Identifies the Record-Identifier(s) for
transactions that are part of the same business
activity.
IV.A.7. Action-Requested [Optional, good
practice]
Identifies if an action was requested as a
result
of the transaction. Could enable links
to past transactions if they occurred.
IV.A.8. Recipient Specific-Configuration Data
[Optional, good practice]
Identifies the permissions and views that the
recipient would have had. May reference
the data dictionary.
IV.B. Responsibility Metadata
Identifies the organization, units and individuals
responsible for the recorded
transaction.
IV.B.1. Originating-Organization
[Mandatory]
Identifies the organizational unit engaged in
the
recorded transaction - from the legal
entity down to the specific office of origin.
IV.B.2. Authorization [Optional, good
practice]
Identifies the source of authorization for
specific office(s)/position(s)/individual(s)
authorized to engage in the identified transaction.
IV.C System Accountability Metadata
Certifies the procedures and systems logs of the system
during the period of
operation.
IV.C.1 System Audit-Responsible [Mandatory]
Citation to most recent system and procedure
audit
transactions that contain
evidence of the system being responsible.
IV.C.2 System Audit-Implemented [Mandatory]
Citation to most recent system and procedure
audit
transactions that contain
evidence of the system being implemented.
IV.C.3 System Audit-Consistent [Mandatory]
Citation to most recent system and procedure
audit
transactions that contain
evidence of the system being consistent.
V. CONTENT LAYER
Contains the actual data engaged in the transaction.
VI. USE HISTORY LAYER
Documents evidentially significant uses of the record subsequent
to
creation; typically these will include
indexing, redacted releases, and record disposition/destruction
under record retention authority, but other
uses (for eyes only viewing, etc.) may be recorded. This
layer occurs at the end of the physical record
to permit adding of entries without having to open the
record.
VI.A. Use History Metadata (Repeatable)
Identifies the history of use of the record - the type
of
use, when it was used, and by whom.
Also indicates any redactions of the data.
VI.A.1. Use-Type [Mandatory]
Identifies how the data was used: viewed,
copied,
edited, filed, indexed, classified,
sent, disposed, etc. This involves identifying the various types
of use permitted by the
system.
VI.A.3. Use-Instance-Time [Mandatory]
Identifies when the data was used - i.e. the
date
and time the data was
used.
VI.A.4. Use-Instance-User [Mandatory]
Identifies who or what used the data on a given
date at a given time.
VI.A.5. Use-Evidential Consequences [Mandatory if
redacted on release]
Identifies the impact of a particular use
(for
example, may identify the part of the
record released, the terms used in indexing, the importance of a
specific view, or what part of the record
was viewed).
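Because the Use History layer is meant to grow by addition, one entry of VI.A and an append-only update might be sketched as follows. The class name, the function name, and the decision to return a new list rather than modify the existing one are assumptions of this illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class UseHistoryEntry:
    use_type: str                # VI.A.1, e.g. "viewed", "copied", "sent"
    use_instance_time: datetime  # VI.A.3
    use_instance_user: str       # VI.A.4
    evidential_consequences: Optional[str] = None  # VI.A.5


def append_use(history: List[UseHistoryEntry],
               entry: UseHistoryEntry,
               redacted_on_release: bool = False) -> List[UseHistoryEntry]:
    """Return a new history with the entry added at the end."""
    # VI.A.5 is mandatory when the record was redacted on release.
    if redacted_on_release and not entry.evidential_consequences:
        raise ValueError("Use-Evidential Consequences is mandatory "
                         "for redacted releases")
    return history + [entry]  # earlier entries are never altered
```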
* Note: Although it is possible to conduct a transaction
that adds no new data content to existing
records (e.g., only forwards pre-existing material, without so
much as a cover note), and it is possible to
have transactions which do not incorporate previously existing
records, it is not possible to have a
transaction without any content. Thus the "Record" cluster is
mandatory, although the metadata items in
it are both optional. The "Content" level is therefore also
mandatory.