Should we regard Repository as yet another noun of uncertain parentage that is searching for meaning? R is for Repository in the same way that P was for Portal, O was for Ontology and M was for Metadata. N? Nothing just yet, or maybe N is for Network. Fortunately, the other nouns have had some prior usage. Repository has less common usage except as a place to store furniture.
Personally I've always preferred verbs, as these almost automatically put the focus on actions (or states of being), on tasks of actors, as essential links in the subject/verb/object triple.
It is not self-evident to me whether Repository is a new label to describe something(s) that have existed for a while or something(s) that have come into existence to justify the term. In the early 1980s I worked with the Scottish Education Data Archive, which was a collection of survey datasets (on related topics, and generated by a research centre in a university over time). Then in the mid to late 1980s and 1990s I worked with Edinburgh University Data Library, which was both a collection of user-contributed datasets and a collection of third-party published datasets. In both instances the purpose of the 'data archive' and 'data library' was to provide access to those datasets to them that wants them. For both, there was some forward thinking, in that we collected and curated ahead of demand: for example, we took in user-contributed digitised boundaries of a particular geography before we knew anyone would re-use them. In the late 1990s and since, I have worked on a variety of online services which depend upon the management of databases of data objects, datasets and datastreams that others (not me) have created, although these are mostly not 'user- or community-generated'. We did not call them repositories at the time - or not until quite recently.
Jorum is a national repository of learning materials, devised and developed by staff at EDINA and Mimas in response to expressed requirements to keep stuff safe, and to enable and facilitate sharing. What makes it a repository? It's a database that we call a repository. Why? Because that was the term of the moment and was and is understood within a certain 'designated community', but not much beyond. When thinking about Jorum, about the repository built for GRADE, about the store of datasets used for eMapScholar, and then about the Depot (in the Prospero project), we thought that Cliff Lynch's statement that "a university-based institutional repository is a set of services" needed re-phrasing: a repository is a managed database that supports three (or more) services, necessarily including deposit (ingest), keep-safe and access (download). But any decently managed database does that, surely?
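To make those three services concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the class name ManagedStore, the method names, and the use of a SHA-256 checksum as a stand-in for 'keep-safe'. It is not a description of how Jorum, the Depot or any EDINA service is actually built.

```python
import hashlib


class ManagedStore:
    """Hypothetical in-memory store offering deposit, keep-safe and access."""

    def __init__(self):
        self._objects = {}    # identifier -> stored bytes
        self._checksums = {}  # identifier -> SHA-256 hex digest recorded at deposit

    def deposit(self, identifier, content):
        """Deposit (ingest): store the bytes and record a checksum."""
        self._objects[identifier] = content
        self._checksums[identifier] = hashlib.sha256(content).hexdigest()

    def is_safe(self, identifier):
        """Keep-safe: check the stored bytes still match the recorded checksum."""
        current = hashlib.sha256(self._objects[identifier]).hexdigest()
        return current == self._checksums[identifier]

    def access(self, identifier):
        """Access (download): return the stored bytes."""
        return self._objects[identifier]


# Illustrative usage with a made-up identifier and payload.
store = ManagedStore()
store.deposit("example-dataset", b"some deposited bytes")
print(store.is_safe("example-dataset"))
print(store.access("example-dataset"))
```

Nothing in the sketch is peculiar to a repository; the same three methods would sit comfortably on any managed store of objects.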
Machine-to-machine (M2M) access, by API (and OAI-PMH), has been put up as a necessary characteristic of a repository. But M2M access has been commonplace for many services from EDINA and Mimas, and again, is that not just what we would want from any managed network-accessible database?
Digimap is built upon a range of databases, some populated by data from the Ordnance Survey, some by derived data (value added, curated by EDINA) and now also some contributed by users.
I confess I am at a loss to understand what is distinctive about a repository. Except, perhaps, that the attention should focus on the quality and nature of the service that is delivered to the (potential) depositor. Understanding why someone wants to deposit (share) something, and what would constitute reward (in terms of happiness, not just lack of pain) for the act of depositing, is hard, elusive and novel. We are examining how to make the Depot into a service for happy putting; so too with Jorum. The motives for sharing differ, as does the nature of the workflow during which 'deposit' could be considered natural.
Now B is for Bucket: must it hold objects as well as liquid, must it provide means by which things can be poured into it, as well as out? Is there a hole in the bucket, does it have to have a handle, what if there was a spout?