Keeping the data open
Something that I hadn’t previously considered about the proposed Book Rights Registry (BRR) in the Google Book Search settlement is the role it might play in imposing data manipulations and metadata enrichment, and then allocating their costs, whether real or transactional, among subscribers, consumers, and other participants (including authors and publishers).
For example, the BRR could in theory impose ISBNs and other identifiers on orphan works that lack them. EDItEUR could provide the ISBNs through a special grant for this purpose, with the costs apportioned among rightsholders and consumers.
Some of these data manipulations would be useful enhancements, but they could also be enacted without sufficiently broad engagement, or without considering how data flows and is managed among publishers, libraries, and consumers, particularly as those practices evolve. A related risk is that non-traditional but potentially higher-return options might not be endorsed. Both risks make it increasingly critical to coordinate stakeholders across all of these communities on metadata issues.
The BRR currently has no mechanism for requiring community coordination. The settlement mentions only an advisory board, which has not yet been named and has no power of compulsion. I believe the default in this area is goodwill, but it would be better for conversations among the parties to be coordinated openly.
Additionally, these concerns argue strongly for the continued maintenance and availability of open sources of bibliographic and rights data, such as the Internet Archive’s Open Library (OL). Although OL has been poorly accessible through APIs, encouraging the unimpeded flow of descriptive data without use restrictions is vital to the continued evolution of books and publishing, and to our understanding of how and what we read. In conjunction with OL, it will be essential to make all known rights attributes associated with an object available, ideally as a single package. These data should be accessible in XML for easy machine consumption, and not provided only through user-driven search interfaces.
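As a rough illustration of what a machine-readable rights package might look like, here is a minimal Python sketch that bundles descriptive data and rights attributes for one work into a single XML record. The element names (`work`, `identifier`, `rights`, `attribute`) and the sample values are hypothetical, invented for this example; they are not drawn from the settlement, the BRR, or Open Library.

```python
import xml.etree.ElementTree as ET


def build_rights_package(title, identifiers, rights):
    """Serialize descriptive and rights data for one work as one XML record.

    All element and attribute names here are hypothetical, chosen only
    for illustration; they follow no published schema.
    """
    work = ET.Element("work")
    ET.SubElement(work, "title").text = title

    # Identifiers such as ISBNs, keyed by scheme name.
    for scheme, value in sorted(identifiers.items()):
        ident = ET.SubElement(work, "identifier", scheme=scheme)
        ident.text = value

    # Every known rights attribute travels together as one package.
    package = ET.SubElement(work, "rights")
    for name, value in sorted(rights.items()):
        attr = ET.SubElement(package, "attribute", name=name)
        attr.text = value

    return ET.tostring(work, encoding="unicode")


# Made-up sample values, not real bibliographic data.
record = build_rights_package(
    "An Example Title",
    {"isbn": "0000000000"},
    {"status": "unknown", "rightsholder": "unlocated"},
)
print(record)
```

The point of the sketch is only that the rights attributes sit alongside the descriptive data in one record that a program can fetch and parse, rather than being reachable only through a search page.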
One of the worries about the settlement is that it creates a great many potential new sources of leverage and engagement, whose consequences could leave even current stakeholders in the agreement uncomfortable.