Outside the library sector, the most common approach to language tagging and matching isn't ISO 639-2 or ISO 639-3 but BCP 47.

Quite a number of ISO 639-2 language codes represent what ISO 639-3 treats as macrolanguages or language collections. For instance, 'kar' (Karen languages) in ISO 639-2 resolves to 20 language codes in ISO 639-3.

But ISO 639-3 by itself isn't sufficient to fully identify a written language. For example, you could have:

sr-Cyrl for Serbian in the Cyrillic script;
sr-Latn for Serbian written in the Latin orthography;
sr-Latn-alalc97 for romanised Cyrillic Serbian, based on the ALA-LC Cyrillic romanisation table published in 1997.

It's worth noting that the only ALA-LC romanisation tables that can be specified in BCP 47 are the 1997 editions.

Ultimately it depends on what a library is working on. If you are cataloguing, then all you have is ISO 639-2/B. If you are working on a digitisation or linked data project, it is much better to use BCP 47 correctly, which would align your resources more accurately with the broader information ecosystem in which they exist.

Andrew

On 2 Jun 2016 9:15 am, "Craig Franklin" wrote:

> We've never had any problems sticking to ISO639-2 codes (in cases there
> isn't a shorter ISO639-1 code available). I'm interested in what sort of
> regional languages you might be dealing with where there are significant
> gaps in that standard?
>
> You might also look at ISO 639-3, which is quite comprehensive but also
> introduces a fair chunk of complexity:
>
> http://www-01.sil.org/iso639-3/download.asp
>
> Cheers,
> Craig Franklin
>
> On 2 June 2016 at 08:59, Greg Lindahl wrote:
>
> > Some of the Internet Archive's library partners are asking us about
> > language metadata for regional languages that don't have standard
> > codes. Is there a standard way of dealing with this situation?
> >
> > Overall we use MARC codes https://www.loc.gov/marc/languages/ which
> > were last updated in 2007. LOC also maintains ISO639-2
> > https://www.loc.gov/standards/iso639-2/php/code_list.php last updated
> > in 2014.
> >
> > The languages in question are regional languages which are currently
> > lumped together in both standards. With the recent rise in interest
> > and funding for regional languages, it's no surprise that some
> > catalogers want to split these languages out into separate codes.
> >
> > Thanks!
> >
> > -- greg
> >
>
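
For anyone who wants to experiment with the BCP 47 tags mentioned in the reply at the top of this message (sr-Cyrl, sr-Latn, sr-Latn-alalc97), here is a minimal Python sketch of how such a tag breaks down into subtags and how simple prefix matching works. The function names split_tag and matches_range are hypothetical, chosen only for this illustration. The pattern is a deliberate simplification and not the full RFC 5646 grammar: it assumes a 2 or 3 letter language subtag and ignores extlang, extension, private-use and grandfathered tags. The matching function follows the "basic filtering" rule described in RFC 4647.

```python
import re

# Simplified BCP 47 pattern (an assumption for illustration, NOT the full
# RFC 5646 grammar): language, optional script, optional region, variants.
TAG_RE = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?"
    r"(?P<variants>(?:-(?:[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}))*)$"
)


def split_tag(tag):
    """Split a simple BCP 47 tag into its subtags (hypothetical helper)."""
    m = TAG_RE.match(tag)
    if m is None:
        raise ValueError("not a tag this simplified pattern understands: %r" % tag)
    script = m.group("script")
    region = m.group("region")
    return {
        # Conventional casing: language lower, script title, region upper.
        "language": m.group("language").lower(),
        "script": script.title() if script else None,
        "region": region.upper() if region else None,
        "variants": [v.lower() for v in m.group("variants").split("-") if v],
    }


def matches_range(tag, language_range):
    """RFC 4647 basic filtering: case-insensitive exact or prefix match."""
    tag = tag.lower()
    language_range = language_range.lower()
    if language_range == "*":
        return True
    return tag == language_range or tag.startswith(language_range + "-")


if __name__ == "__main__":
    for example in ("sr-Cyrl", "sr-Latn", "sr-Latn-alalc97"):
        print(example, split_tag(example))
    # A request for "sr-Latn" accepts the romanised variant but not Cyrillic.
    print(matches_range("sr-Latn-alalc97", "sr-Latn"))
    print(matches_range("sr-Cyrl", "sr-Latn"))
```

With this sketch, a resource tagged sr-Latn-alalc97 is still found by a search for sr-Latn or plain sr, which is the kind of matching a bare ISO 639 code cannot express. A production system would more likely rely on an existing, maintained BCP 47 library than on a hand-written pattern like this one.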