The AI4LAM Speech-to-Text Working Group requests feedback on a draft specification for Transcript Provenance Metadata Elements (TPME).
Motivation: Many libraries, archives, and museums create and use transcripts associated with audiovisual resources. Transcripts come from various sources, and they vary in quality and other characteristics. For example, if a given transcript was created by Whisper, one would like to know: With which model? Using what parameter values? Has it been corrected by a human? According to what conventions? TPME is intended to capture this kind of data.
If you are part of an organization that needs this kind of data, we invite your feedback on the draft available here:
https://docs.google.com/document/d/10hEbp_RkOeSm5uorTlI_QmyE0sD1G5OAkWUfHICf8W0
There are several ways to offer feedback:
- Add comments directly to the Google Doc
- Post to the #speech-to-text channel in the AI4LAM Slack here: https://ai4lam.slack.com/
- Send email to [log in to unmask]
We will be discussing this draft specification, at least one current implementation, and the roadmap to a recommendation, at the next monthly call of the AI4LAM Speech-to-Text Working Group on Tuesday, May 27, 2025 at 16:00 UTC. For information about the call, please consult our running notes document at https://docs.google.com/document/d/1lUI1l_cfJ-hM7ZXgfITyjcevUxFfc0C6HRzhc_Ui8bU
With kind regards,
Owen King
and the organizers of the AI4LAM Speech-to-Text Working Group
Owen King
Metadata Operations Specialist