6, Define repository as part of the user’s (author/researcher/learner) workflow
It is important to take account of users' workflows when defining a repository, so that it is not seen as a system removed from the users' daily routine.
Interoperability needs to be motivated by service requirements, not fetishized as an end in itself.
As part of our work to "examine the feasibility of approaches to improve the consistency with which repositories share material", we are looking at this in regard to 3 areas: metadata (this idea), the materials themselves and descriptions of repository policies (e.g. on IPR) [materials and policies appear as separate ideas].
When we use the term repository in the context of JISC (and other repository networks), essentially it means making content (in our case produced as part of research, learning and teaching) available over the network so it can be shared and used. But the word doesn’t say that. The word says store. We should be saying what we mean. We should really be talking about making content available on the web. And if we are concerned with preserving content, we should talk about doing that, etc. The term repository has almost become meaningless because so many uses and functions are bundled up together under it.
The current repository technology is library/cataloger centric: items are uploaded (usually by a cataloger, not the author), and most of the meta-data is added by a subject specialist. In this model, the author-as-depositor is (at best) just an initiator for a deposit process.
A better solution would be to move towards a Current Research Information System [CRIS], where the academic can organise their areas of interest [AOI]; see the research grants they have (and associate them with their AOI); lodge keep-safe copies of work-in-progress, data-sets, talks, ideas for future work, posters, etc. (and associate them with grants or AOIs).
From this corpus of data, the academic can indicate what is visible locally (within the research group/department/organisation) and what is available globally... and from that "globally available" pool, an "Institutional Repository" can be assembled.
The big advantage of a system like this is that the user only needs to define the meta-data specific to that object (an AOI has a title and a description, and inherits a creator from the CRIS; an article has a title and an abstract, but also inherits data from the associated grant and/or AOIs). This is a much smaller "keystroke" barrier (or whatever you call that "I don't want to enter lots of metadata" problem).
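The inheritance idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of any existing CRIS: the `Record` class, the `INHERITABLE` field set, the `institutional_repository` function and all the example values are hypothetical names invented here.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical set of fields that flow down from a parent record.
# Titles, abstracts and descriptions stay with the object they describe.
INHERITABLE = {"creator", "affiliation", "funder", "subject"}


@dataclass
class Record:
    """One item in a hypothetical CRIS: a person, AOI, grant, article..."""
    kind: str
    own: Dict[str, str]                       # what the user actually typed
    parents: List["Record"] = field(default_factory=list)
    visibility: str = "local"                 # "local" or "global"

    def metadata(self) -> Dict[str, str]:
        """Combine inherited fields with the user's own; own values win."""
        merged: Dict[str, str] = {}
        for parent in self.parents:
            for key, value in parent.metadata().items():
                if key in INHERITABLE:
                    merged.setdefault(key, value)
        merged.update(self.own)
        return merged


def institutional_repository(records: List[Record]) -> List[Dict[str, str]]:
    """Assemble the 'Institutional Repository' from the globally visible pool."""
    return [r.metadata() for r in records if r.visibility == "global"]


# Example data (all values are made up for illustration).
profile = Record("person", {"creator": "A. Researcher"})
aoi = Record("aoi", {"title": "Digital curation",
                     "description": "Keeping research data usable"}, [profile])
grant = Record("grant", {"title": "Example grant",
                         "funder": "Example Funder"}, [aoi])
article = Record("article", {"title": "An example article",
                             "abstract": "A short abstract."},
                 [grant], visibility="global")
```

Here the author types only a title and an abstract for the article; the creator and funder arrive by inheritance, and only the record marked "global" ends up in the assembled repository.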
The definition should not make assumptions about implementation architecture, i.e. whether deposited collection(s) are held at institutional or network level.
This is the Andy Powell worry; we have made the repository too much of a "special thing" operating under "library rules". Make it more like Slideshare. I'm going to express this another way...
People who might create services from repository-based information
will be looking for simple human-readable information on the policies,
formats and metadata used by repositories. This is as important as
creating machine-readable interfaces.
The changes in technology, the diversity of cataloguing practice,
the diversity of ownership and legal considerations and the
possibilities for metadata to be created remotely all mean that
acceptable and achievable recommendations for consistency between
repositories are likely to be broad principles with examples of good
practice rather than prescriptive rules or precise recommendations.
With acknowledgement for this idea to Owen Stephens' recent Tweet. My interpretation of this idea is that 'repositories' are best viewed as a 'type' of data store supporting a variety of services, embedded in various workflows. This fits nicely with Paul Walk's concept of a 'source repository' (see http://tiny.cc/FIHwc) being a simple system with complexity moved to specialised services. I suppose this approach isn't that far removed from the original OAI concepts of data provider and service provider, though the focus there was on access, whereas now we are considering a wider context for repositories.
If the repository is to become anything other than a final destination for public objects, then the user needs control over access. This control must be able to ALLOW access to the objects by colleagues, wherever they work, as well as prevent access by others.
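As a rough sketch of what "user control over access" might mean in practice, the following Python fragment models an object whose owner can share it with named colleagues at any institution while everyone else is refused. The `StoredObject` class, its methods and the example identities are assumptions made up for this illustration; a real repository would tie this to institutional logins.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class StoredObject:
    """A deposited item whose owner decides who may read it (hypothetical)."""
    owner: str                                # e.g. an institutional identity
    public: bool = False
    shared_with: Set[str] = field(default_factory=set)

    def share(self, acting_user: str, colleague: str) -> None:
        """Only the owner may grant access; the colleague may work anywhere."""
        if acting_user != self.owner:
            raise PermissionError("only the owner can share this object")
        self.shared_with.add(colleague)

    def can_read(self, user: Optional[str]) -> bool:
        """Public objects are open to all; otherwise owner and invitees only."""
        if self.public:
            return True
        return user is not None and (user == self.owner
                                     or user in self.shared_with)


# Example: a draft shared with a colleague at another institution.
draft = StoredObject(owner="alice@uni-a.example")
draft.share("alice@uni-a.example", "bob@uni-b.example")
```

The point of the sketch is that sharing is decided per object by the depositor, and that the colleague's institution plays no part in the check.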
I guess this is the workflow idea again, but stated another way. Don't get too hung up on "workflows" in the e-science sense (Kepler, Taverna et al). This is about making the repository fit in with what people are trying to do, e.g. write the article, keep multiple versions, share with their colleagues in other institutions...
Another from the Research Repository System (RRS) blog posts:
Publisher liaison is maybe controversial. But why shouldn’t the RRS staff (or your library) support you in dealing with publishers? The RRS wants your articles and your data, and should help you negotiate and reserve the rights so that they can get them. So publisher liaison would include rights negotiation, submission to the publisher on your behalf of a specific version, support through the editorial revision process, and recovery of metadata from the published version for the RRS records and your own bibliography, web page and CV. Naturally, deposit in the repository would be integrated in this workflow; you only have to authorise opening to the public, or perhaps a more restricted audience.
The umbrella term "repository" conflates two very different kinds of services - services whose primary purpose is to preserve a type of media, and services whose primary purpose is to enable media to be shared and used by people. They don't look the same, they have different kinds of users and roles, they don't share the same concerns, and you use different language to talk about their features. Maybe we would get further by having an amicable divorce, and only get together to talk about things that are completely generic, like storage.
OK, I'll go the whole hog in relation to the RRS blog posts:
At a very basic level, the RRS should [be associated with] a Persistent Storage service. Completely agnostic as to objects, Persistent Storage would provide a personal, or group-oriented (ie within the institution) or project-oriented (ie beyond the institution) storage service that is properly backed up. There’s no claim that Persistent Storage would last for ever, but it must last beyond the next power spike, virus infection or laptop loss! It has to be easy to use, as simple as mounting a virtual drive (but has to work equally easily for researchers using all 3 common OS environments). Conversely (and this isn’t easy), there must be reliable ways of taking parts of it with you when away from base, so synchronisation with laptops or remote computers is essential. It should support anything: data, documents, ancillary objects, databases, whatever you need. It’s possible that “cloud computing” eg Amazon S3, the Carmen Cloud or other GRID services might be appropriate.
A repository should be for content which is required and expected to be useful over a significant period. It may host more transient content, but by and large the point of a repository is persistence. While suggesting a repository should be a "full OAIS" has not proved acceptable to this group so far, investment in a repository and this need for persistence suggest that repository managers should aim to make their content both accessible and usable over the medium (rather than short) term. For the purposes of this exercise, let's suggest factors of around 3: short term 3 years, medium term around 10 years, long term around 30 years plus. Ten years is a reasonable period to aspire to; it justifies investment, but is unlikely to cover too many major content migrations.
To achieve this, I think repository management should assess their repository and its policies. Using OAIS at a high level as a yardstick would be appropriate. Full compliance would not be required, but giving thought to each major concept and element would be good practice.
This "idea" is to replace the "full OAIS" approach with something more realistic and achievable.
Again from the RRS blog posts:
We don't think about identity management as part of the repository, although a really annoying early experience of DSpace related to the requirement for a completely separate identity. This seems to have been overcome by getting the librarian to do mediated deposit for you, but I don't have the feeling that the repository is well integrated into the institutional identity system. It should be, but I want more!
I may see the RRS as a special case of an Institutional Repository (IR), but many if not most research collaborations are cross-institutional. This means that if there is to be support for cross-institutional authoring, there has to be support for members of other institutions to log in to your RRS. And this has to be seamless and easy, ie done without having to acquire new identities.
In addition, Researcher Identity should provide name control, that is, it knows who you are and will fill in a standardised version of your name in appropriate places. It should know your affiliation (institution, department/school, group, project and/or possibly work package). It might know some default tags for your work (eg Chris is normally talking about "digital curation"). However, this naming support must extend beyond your institution, so that collaborators and co-authors can be first-class users of other features. And it should relate to your (and their) standard institutional username and credentials; nothing extra to remember. This implies (I think) something like Shibboleth support.
This is getting kind of complicated, and verging towards another complex realm of Current Research[er] Information Systems (CRIS, mentioned in other ideas). These worthy systems also aim to make your life easier by knowing all about you, and linking your identity and work together. But they are complex, have their own major projects and standards, and have been going for years without much impact that I can see, except in a few cases. The RRS should take account of EuroCRIS and CERIF (see Wikipedia page) as far as they might apply.
Managing data can be a big problem. Any data that might, for example, become supplementary data in an article needs curating. Help the user by providing facilities to capture and hold intermediate versions of the data, and the final public version.
This is part of our work to examine the feasibility of approaches to improve the consistency with which repositories share the materials they hold (this idea), the metadata, and descriptions of repository policies.
Far from becoming irrelevant, metadata for repository items will
become more important but it will increasingly be created and assigned
remotely. This will be by automated procedures such as indexing and
text analysis and also by users and readers, through the use of
tagging mechanisms. These developments will have implications for
consistency between repositories and between items.
Most early Institutional Repositories were research repositories. Some are purely repositories housing digital objects, as in the idea 'Repositories are "collections of digital objects"'. However, since one of the primary aims is to showcase the intellectual assets of the institution (as compared to providing Open Access to peer-reviewed journal articles), another model was 'hybrid'. The use as a bibliography (suggested both by previous practice and by senior academics) required the metadata to be deposited even if it was not possible to deposit the 'publication'. This is particularly important if you want to showcase the whole institution well, including the Humanities, where outputs are not so easily deposited, e.g. a book or exhibition.
Therefore one model is 'hybrid', including both digital objects and their metadata, and sometimes just metadata, or metadata plus links to trusted repositories elsewhere. This latter aspect may become more important as the number of these trusted (e.g. funder) repositories grows. Of course, you can also make a subset of this repository which includes 'full text only', as in the alternative "digital object repository" model, but this does not then give a full picture of the institution.
Hey, Jessie M.N., Simpson, Pauline and Carr, Leslie A. (2005) The TARDis Route Map to Open Access: developing an Institutional Repository Model. In, Dobreva, Milena and Engelen, Jan (eds.) ELPUB2005 From Author to Reader: Challenges for the Digital Content Chain: Proceedings of the 9th ICCC International Conference on Electronic Publishing, Katholieke Universiteit Leuven, Leuven-Heverlee, Belgium, 8-10 June 2005. Leuven, Belgium, Peeters Publishing, 179-182.
http://eprints.soton.ac.uk/16262/
Again, the Andy Powell idea. This one, I think, is more about sharing, embedding, mashups. Think Flickr. Think sneep.
Inconsistency is a fact of life, and any repository instance or system that wants to avoid bottlenecks is going to have to accept items that have inconsistent metadata (and possibly inconsistent formats and policies, though consistency in those areas may be easier or more important to enforce).
That doesn't mean you have to settle for it, though. It's possible to take a progressive approach, where messy metadata comes in, and is then brought into consistency (by humans or machines) with particular standards. Moreover, certain metadata formats (such as MODS) have ways of marking metadata that's informal first-cut versus metadata that's been brought into conformance with a standard, allowing people or programs to go through repositories and catalogs and find and prioritize items to be brought into conformance.
I did something like this with my online books collection, where I started with non-authority-controlled names and no subject terms, and over time brought more names under LC authority control, and also over time went from no subjects, to automatic subject assignment with somewhat out-of-date authorities, to reviewed subject assignment with current authorities.
The process isn't completely finished (though it's largely done by now), but it doesn't have to be complete to be useful. The metadata includes (implicitly or explicitly) indicators of the quality of name and subject metadata therein, and I can prioritize what to update based on usage and the influx of new and related material.
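A minimal Python sketch of this progressive approach follows. It is not the code behind the online books collection: the quality levels, the `Item` fields and the `cleanup_queue` function are hypothetical, and ranking purely by usage (with the size of the quality gap as a tie-breaker) is an assumption made to keep the example short.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical quality levels, lowest first.
NAME_LEVELS = ["uncontrolled", "authority_controlled"]
SUBJECT_LEVELS = ["none", "automatic", "reviewed"]


@dataclass
class Item:
    title: str
    name_quality: str       # one of NAME_LEVELS
    subject_quality: str    # one of SUBJECT_LEVELS
    usage: int              # e.g. recent downloads or views


def quality_gap(item: Item) -> int:
    """How many steps the item is from fully conformant metadata."""
    name_gap = len(NAME_LEVELS) - 1 - NAME_LEVELS.index(item.name_quality)
    subject_gap = (len(SUBJECT_LEVELS) - 1
                   - SUBJECT_LEVELS.index(item.subject_quality))
    return name_gap + subject_gap


def cleanup_queue(items: List[Item]) -> List[Item]:
    """Items still needing work, most-used first; bigger gaps break ties."""
    pending = [i for i in items if quality_gap(i) > 0]
    return sorted(pending, key=lambda i: (i.usage, quality_gap(i)),
                  reverse=True)


# Example records (made up for illustration).
items = [
    Item("A", "uncontrolled", "none", usage=5),
    Item("B", "authority_controlled", "reviewed", usage=100),
    Item("C", "authority_controlled", "automatic", usage=40),
]
queue = cleanup_queue(items)
```

Messy records are accepted straight away, each carries an explicit marker of how far it has been brought into conformance, and the queue tells a person or a program what to fix next.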
3) My repository aims for accessibility and/or usability of its contents for the long term (say greater than 10 years).
Social Web