
Repositories - communicating the idea

 


The purpose of this site is to gather ideas and opinions about how repositories are defined and about consistency between them. The discussion will be used to prepare the outputs from the Repository Architecture meeting on the 3rd of July and the Repositories Road Map meeting on the 21st of July, and a report on consistency.

Please comment or vote on ideas or post your own for others to comment and vote on.


6, Define repository as part of the user’s (author/researcher/learner) workflow 
It is important to take account of users' workflows when defining a repository, so that it is not considered a system removed from the users' daily routine.
Tags : workflow users
Repositories are dead, long live repositories 
The current repository technology is library/cataloger centric: items are uploaded (usually by a cataloger, not the author), and most of the meta-data is added by a subject specialist. In this model, the author-as-depositor is (at best) just an initiator for a deposit process.

A better solution would be to move towards a Combined Research Information System [CRIS], where the academic can organise their areas of interest [AOI]; see the research grants they have (and associate them with their AOI); lodge keep-safe copies of work-in-progress, data-sets, talks, ideas for future work, posters, etc (and associate them with grants or AOIs).

From this corpus of data, the academic can indicate what is visible locally (within the research group/department/organisation) and what is available globally... and from that "globally available" pool, an "Institutional Repository" can be assembled.

The big advantage of a system like this is that the user only needs to define the meta-data specific to that object (an AOI has a title and a description, and inherits a creator from the CRIS; an article has a title and an abstract, but also inherits data from the associated grant and/or AOIs) - this is a much smaller "keystroke" barrier (or whatever you call that "I don't want to enter lots of metadata" problem).
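
As a rough illustration of the inheritance idea, here is a minimal Python sketch. The Record class, the field names and the sample values are all made up for illustration and are not taken from any real CRIS; a real system would also have to decide which fields are allowed to propagate.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """One object in a hypothetical CRIS: a profile, an AOI, a grant, an article."""
    kind: str
    own: dict                                     # metadata the user typed for this object
    parents: list = field(default_factory=list)   # associated grants, AOIs or profile

    def metadata(self) -> dict:
        """Merge inherited metadata with the object's own; own values win."""
        merged = {}
        for parent in self.parents:
            merged.update(parent.metadata())
        merged.update(self.own)
        return merged


# The CRIS profile supplies the creator, so nobody has to retype it.
profile = Record("profile", {"creator": "A. Researcher"})
aoi = Record("aoi", {"title": "Digital curation",
                     "description": "Keeping data usable"}, [profile])
grant = Record("grant", {"funder": "Example Funder",
                         "grant_id": "EX-001"}, [profile])

# The author only types a title and an abstract for the article.
article = Record("article", {"title": "A paper",
                             "abstract": "Short abstract"}, [grant, aoi])

full = article.metadata()
print(f"typed {len(article.own)} fields, record has {len(full)}")
```

The article's own title overrides the AOI title it would otherwise inherit, while creator, funder and grant_id come along without any extra keystrokes.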
5, Focus on services to users enabled by digital collections in repositories 
(i.e. emphasise benefits)
Tags : users benefits services
Consistency between repositories is not an end in itself; it is only important if it is a requirement of real value-added servic 
Interoperability needs to be motivated by service requirements, not fetishized as an end in itself.
Comments (5)Posted By : dempseyl on 06/26/2008 under Consistency
Allow the user fine-grained disclosure/access control to repository objects 
If the repository is to become anything other than a final destination for public objects, then the user needs control over access. This control must be able to ALLOW access to the objects by colleagues, wherever they work, as well as prevent access by others.
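
A minimal Python sketch of what such a check might look like, assuming users are identified by a "name@institution" string. The RepositoryObject class, the can_access function and the example addresses are hypothetical, not part of any existing repository software.

```python
from dataclasses import dataclass, field


@dataclass
class RepositoryObject:
    owner: str
    public: bool = False
    allowed_users: set = field(default_factory=set)     # named colleagues, any institution
    allowed_domains: set = field(default_factory=set)   # whole institutions or groups


def can_access(obj, user=None):
    """Return True if `user` (a 'name@institution' string, or None) may read `obj`."""
    if obj.public:
        return True
    if user is None:
        return False                     # anonymous users only see public objects
    if user == obj.owner or user in obj.allowed_users:
        return True
    domain = user.rsplit("@", 1)[-1]     # institution part of the identifier
    return domain in obj.allowed_domains


draft = RepositoryObject(
    owner="chris@ed.example",
    allowed_users={"ana@other-uni.example"},   # a co-author elsewhere
    allowed_domains={"ed.example"},            # everyone at the home institution
)

print(can_access(draft, "ana@other-uni.example"))   # named colleague elsewhere
print(can_access(draft, "bob@other-uni.example"))   # stranger at the same place
```

The point of the sketch is that the allow list is keyed on people rather than on the institution alone, so a colleague elsewhere can be let in without opening the object to everyone at their institution.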
Comments (3)Posted By : c.rusbridge on 07/14/2008 under Repository functions
Say what we mean: stop using the term repository 
When we use the term repository in the context of JISC (and other repository networks), essentially it means making content (in our case produced as part of research, learning and teaching) available over the network so it can be shared and used. But the word doesn’t say that. The word says store. We should be saying what we mean. We should really be talking about making content available on the web. And if we are concerned with preserving content, we should talk about doing that, and so on. The term repository has almost become meaningless because so many uses and functions are bundled up together under that term.
Comments (12)Posted By : r.bruce on 07/17/2008 under Definiton rules
1, Definition assumptions 
Definition should not make assumptions as to implementation architecture i.e. whether deposited collection(s) held at institutional or network level
Tags : repository architecture institutional network level
Comments (2)Posted By : a.mcgregor on 06/20/2008 under Definiton rules
7. We shouldn't be thinking of repositories as a place. 
With acknowledgement for this idea to Owen Stephens' recent Tweet. My interpretation of this idea is that 'repositories' are best viewed as a 'type' of data store supporting a variety of services, embedded in various workflows. This fits nicely with Paul Walk's concept of a 'source repository' (see http://tiny.cc/FIHwc) being a simple system with complexity moved to specialised services. I suppose this approach isn't that far removed from the original OAI concepts of data provider and service provider, though the focus there was on access whereas now we are considering a wider context for repositories.
Tags : place source
Comments (0)Posted By : rachel.heery on 07/07/2008 under Broadening definition
There are feasible and worthwhile approaches which will improve the consistency with which repositories share metadata 
As part of our work to "examine the feasibility of approaches to improve the consistency with which repositories share material", we are looking at this in regard to 3 areas: metadata (this idea), the materials themselves and descriptions of repository policies (e.g. on IPR) [materials and policies appear as separate ideas].

Tags : repository jisc
Comments (2)Posted By : nf on 06/26/2008 under Consistency
The repository should be more like "part of the web" 
This is the Andy Powell worry; we have made the repository too much of a "special thing" operating under "library rules". Make it more like Slideshare. I'm going to express this another way...
Comments (0)Posted By : c.rusbridge on 07/14/2008 under Repository functions
The repository/library should provide support in the publishing process 
Another from the Research repository System (RRS) blog posts:

Publisher liaison is maybe controversial. But why shouldn’t the RRS staff (or your library) support you in dealing with publishers? The RRS wants your articles and your data, and should help you negotiate and reserve the rights so that they can get them. So publisher liaison would include rights negotiation, submission to the publisher on your behalf of a specific version, support through the editorial revision process, and recovery of metadata from the published version for the RRS records and your own bibliography, web page and CV. Naturally, deposit in the repository would be integrated in this workflow; you only have to authorise opening to the public, or perhaps a more restricted audience.
Comments (3)Posted By : c.rusbridge on 07/14/2008 under Repository functions
Make the repository work for the user, not the other way round 
I guess this is the workflow idea again, but stated another way. Don't get too hung up on "workflows", as in the e-science meaning (kepler, taverna et al). This is about making the repository fit in what people are trying to do, eg write the article, keep multiple versions, share with their colleagues in other institutions...
2, Different definitions are required for different audiences 
Repository does not mean much to a researcher but it has a very specific meaning to a librarian. Therefore we need to make sure that there are definitions that can be tailored to specific audiences to ensure that messages are understood.
Tags : audiences communication
Comments (3)Posted By : a.mcgregor on 06/20/2008 under Definiton rules
Help the user manage data 
Managing data can be a big problem. Any data that might, for example, become supplementary data in an article, needs curating. Help the user by providing facilities to capture and hold intermediate versions of the data, and the final public version.
Repository is associated with a persistent storage system 
OK, I'll go the whole hog in relation to the RRS blog posts:

At a very basic level, the RRS should [be associated with] a Persistent Storage service. Completely agnostic as to objects, Persistent Storage would provide a personal, or group-oriented (ie within the institution) or project-oriented (ie beyond the institution) storage service that is properly backed up. There’s no claim that Persistent Storage would last for ever, but it must last beyond the next power spike, virus infection or laptop loss! It has to be easy to use, as simple as mounting a virtual drive (but has to work equally easily for researchers using all 3 common OS environments). Conversely (and this isn’t easy), there must be reliable ways of taking parts of it with you when away from base, so synchronisation with laptops or remote computers is essential. It should support anything: data, documents, ancillary objects, databases, whatever you need. It’s possible that “cloud computing” eg Amazon S3, the Carmen Cloud or other GRID services might be appropriate.
Comments (3)Posted By : c.rusbridge on 07/14/2008 under Repository functions
The repository should be meshed into a more sophisticated system of researcher identity management 
Again from the RRS blog posts:

We don't think about identity management as part of the repository, although a really annoying early experience of DSpace related to the requirement for a completely separate identity. This seems to have been overcome by getting the librarian to do mediated deposit for you, but I don't have the feeling that the repository is well integrated into the institutional identity system. It should be, but I want more!

I may see the RRS as a special case of an Institutional Repository (IR), but many if not most research collaborations are cross-institutional. This means that if there is to be support for cross-institutional authoring, there has to be support for members of other institutions to log in to your RRS. And this has to be seamless and easy, ie done without having to acquire new identities.

In addition, Researcher Identity should provide name control, that is, it knows who you are and will fill in a standardised version of your name in appropriate places. It should know your affiliation (institution, department/school, group, project and/or possibly work package). It might know some default tags for your work (eg Chris is normally talking about "digital curation"). However, this naming support must extend beyond your institution, so that collaborators and co-authors can be first-class users of other features. And it should relate to your (and their) standard institutional username and credentials; nothing extra to remember. This implies (I think) something like Shibboleth support.

This is getting kind of complicated, and verging towards another complex realm of Current Research[er] Information Systems (CRIS, mentioned in other ideas). These worthy systems also aim to make your life easier by knowing all about you, and linking your identity and work together. But they are complex, have their own major projects and standards, and have been going for years without much impact that I can see, except in a few cases. The RRS should take account of EuroCRIS and CERIF (see Wikipedia page) as far as they might apply.
Comments (0)Posted By : c.rusbridge on 07/14/2008 under Repository functions
Recognise the differences in services for preservation and services for sharing 
The umbrella term "repository" conflates two very different kinds of services - services whose primary purpose is to preserve a type of media, and services whose primary purpose is to enable media to be shared and used by people. They don't look the same, they have different kinds of users and roles, they don't share the same concerns, and you use different language to talk about their features. Maybe we would get further by having an amicable divorce, and only get together to talk about things that are completely generic, like storage.
Repository should aspire to make contents accessible and usable over the medium term 
A repository should be for content which is required and expected to be useful over a significant period. It may host more transient content, but by and large the point of a repository is persistence. While suggesting a repository should be a "full OAIS" has not proved acceptable to this group so far, investment in a repository and this need for persistence suggest that repository managers should aim to make their content both accessible and usable over the medium (rather than short) term. For the purposes of this exercise, let's suggest factors of around 3: short term 3 years, medium term around 10 years, long term around 30 years plus. Ten years is a reasonable period to aspire to; it justifies investment, but is unlikely to cover too many major content migrations.

To achieve this, I think repository management should assess their repository and its policies. Using OAIS at a high level as a yardstick would be appropriate. Full compliance would not be required, but giving thought to each major concept and element would be good practice.

This "idea" is to replace the "full OAIS" approach with something more realistic and achievable.
Comments (0)Posted By : c.rusbridge on 07/22/2008 under Broadening definition
The repository should provide authoring support 
This is a refinement of the current top-rated idea, based on one of my blog posts on research repository systems.

Authoring support should include version control, collaboration, possibly publisher liaison, and be integrated with the repository deposit process. It does need object disclosure control, see below. Version control would support ideas, working drafts, pre-prints, working papers, submitted drafts undergoing editorial changes, and refereed and published versions. Collaboration support would need to include support for multiple authors contributing document parts, and assembly of these into larger parts and eventually “complete” drafts. It should also include some kind of multiple author checkout system for updates, something like CVS or SVN, maybe a bit WIKI-like. It must support a wide choice of document editor, eg Word, OpenOffice.org, LaTeX etc (I don’t know how to combine this with the previous requirement!).
Comments (4)Posted By : c.rusbridge on 07/14/2008 under Repository functions
There are feasible and worthwhile approaches to improve the consistency with which repositories share the materials they hold 
Part of our work to examine the feasibility of approaches to improve the consistency with which repositories share the materials they hold (this idea), the metadata and descriptions of repository policies
Tags : repository jisc
Comments (0)Posted By : nf on 06/26/2008 under Consistency
The repository should have more "web 2.0" features 
Again, the Andy Powell idea. This one, I think, more about sharing, embedding, mashups. Think Flickr. Think sneep.
Comments (0)Posted By : c.rusbridge on 07/14/2008 under Repository functions
Institutional research repositories are based on different models - not only solely a 'digital object' repository 
Most early Institutional Repositories were research repositories. Some are purely repositories housing digital objects as in "Repositories are "collections of digital objects"". However, since one of the primary aims is to showcase the intellectual assets of the institutions (as compared to providing Open Access to peer reviewed journal articles) another model was 'hybrid'. The use as a bibliography (suggested both by previous practice and by senior academics) required the metadata to be deposited even if it was not possible to deposit the 'publication'. This is particularly important if you want to showcase well the whole institution, including the Humanities, where outputs are not so easily deposited eg a book or exhibition.
Therefore one model is 'hybrid' including both digital objects and their metadata and sometimes just metadata or metadata plus links to trusted repositories elsewhere. This latter aspect may become more important as the number of these trusted (eg funder) repositories grows. Of course, you can also make a subset of this repository which includes 'full text only' as in the alternative "digital object repository" model but this does not then give a full picture of the institution.

Hey, Jessie M.N., Simpson, Pauline and Carr, Leslie A. (2005) The TARDis Route Map to Open Access: developing an Institutional Repository Model. In, Dobreva, Milena and Engelen, Jan (eds.) ELPUB2005 From Author to Reader: Challenges for the Digital Content Chain: Proceedings of the 9th ICCC International Conference on Electronic Publishing, Katholieke Universiteit Leuven, Leuven-Heverlee, Belgium, 8-10 June 2005. Leuven, Belgium, Peeters Publishing, 179-182.
http://eprints.soton.ac.uk/16262/
Tags : institutional research repository hybrid model
Comments (0)Posted By : jesshey on 07/21/2008 under Broadening definition
Broad principles not tight prescriptions 
The changes in technology, the diversity of cataloguing practice,
the diversity of ownership and legal considerations and the
possibilities for metadata to be created remotely all mean that
acceptable and achievable recommendations for consistency between
repositories are likely to be broad principles with examples of good
practice rather than prescriptive rules or precise recommendations.
Tags : recommendations schema policies metadata
Comments (0)Posted By : nf on 08/16/2008 under Consistency
Service creators will be looking for human-readable specifications 
People who might create services from repository-based information
will be looking for simple human-readable information on the policies,
formats and metadata used by repositories. This is as important as
creating machine-readable interfaces.
Tags : human readable machine readable repository services
Comments (0)Posted By : nf on 08/16/2008 under Consistency
Let's think outside the box....
This is focused on the researcher world, but the arguments hold for other fields

Q: What is the primary factor for ranking researchers?
A: Citations.
Surely the aim, therefore, of the researcher is to market her work as widely as possible, to maximise the potential for citation.
Given that we are now in the Information Age, where The Internet is the primary source of answers (backed up by reading what has been found, on paper), then the sensible solution is to place enough of the research results on the Internet such that they can be found and assessed, and followed up.
Where, in the Internet, this material is placed is almost moot: the Internet has no location per se - search engines index everything, everywhere.

Q: What is the primary factor for ranking Institutions?
A: The amount of research performed by researchers of standing (see above)
Surely the aim, therefore, of the Institution is to market the work of their researchers, with sufficient "corporate identity" attached, as widely as possible, to maximise the readership of that work.

THEREFORE
I think we can say that researchers need publicity, and Institutions want to be the ones to do it.

The question I see is:
"How can we make it easiest for the researcher to publicise their work, and how can we help the Institution capitalise on that individual publicity?"

"Institutional Repositories" are the current solution - are they the right one?
Comments (0)Posted By : ian.stuart on 08/15/2008 under Miscellaneous
The Repository is about re-engineering institutional business processes 

The concept of the repository is difficult to distinguish from other kinds of institutional services which might be offered (digital archiving for example), unless the original context of the idea is considered, which was (and is) the scholarly communications crisis.

Within the context of the scholarly communications crisis, the original purpose of repositories was to provide a way in which universities could enable access to research which they could no longer afford to buy because of the rising costs of journals. The repository was therefore a tool to aid the re-engineering of the business process of the library (as we came to think), or even the business processes of the entire university (as we now tend to think).

My own view is that if the repository isn't serving that function, it isn't a very useful concept.
Tags : repository definition re-engineering business process scholarly communication crisis
Comments (0)Posted By : philip.hunter on 08/15/2008 under Repository functions
Metadata will increasingly be created remotely at the point of need 
Far from becoming irrelevant, metadata for repository items will
become more important but it will increasingly be created and assigned
remotely. This will be by automated procedures such as indexing and
text analysis and also by users and readers, through the use of
tagging mechanisms. These developments will have implications for
consistency between repositories and between items.
Tags : metadata tagging auto classification indexing
Comments (0)Posted By : nf on 08/16/2008 under Consistency
4, Definition should encompass likely evolution in software solutions 
Examples include content management systems, virtual research environments, CRIS etc
Tags : software future
Comments (0)Posted By : a.mcgregor on 06/20/2008 under Broadening definition
There are feasible and worthwhile approaches to improve the consistency with which repositories share their policies 
Part of our work to examine the feasibility of approaches to improve the consistency with which repositories share descriptions of repository policies (e.g. on IPR) - this idea -, metadata and the materials themselves.
Tags : repository jisc
Comments (2)Posted By : nf on 06/26/2008 under Consistency
Form follows function in defining "global knowledge waiting rooms" 
Humans have never before been called on to save so much stuff in whatever we name these digital containers. Historically we have been compelled by circumstance to let things go, albeit often unwillingly. The list of what we leave behind can include almost everything we care about--books, photographs, hard drives, memorabilia and artworks--to even bigger items such as houses and cars. Whether selling, donating, recycling, sharing or being forced to abandon our stuff due to unforeseen circumstances, we are more or less wired to cope with the dynamic lifecycle of digital and analog “knowledge objects” as they intersect with our lives.

As I hold out hope that I will find time to organize, annotate and share my digital photo archive, others look to keeping ideas--digital text and rich media--around long enough for academic and public review that holds the promise of transformation into vetted knowledge. The timetable for when a paper, dataset or video will become useful, or perhaps even critical, is often unknown. As the cost of storing digital stuff has gone down, we seem less willing than ever before to let things go. The conundrum of coping with the biggest information deluge in human history, coupled with cheap storage, and an unknown timetable for usage seems to equal a disruption in our collective ability to merge and purge our stuff. Terminology discussions for what to call (packed) global knowledge waiting rooms seem to be a by-product. We can now afford to rent an endless number of mini-storage units, but will never have time to arrange or make use of their contents.

Les Carr pointed out during OR08 in Southampton earlier this year that collecting and curating over time is what a persistent and permanent repository backed by policies and institutional commitment implies. A repository is not intended to be a fly-by-night dumping ground. About ten years ago the term "digital library" seemed to be a way to give small or large, and sometimes poorly organized, collections of academically-created web pages a certain gravitas that would promote preservation. A "portal" to resources is also a term that has been used to imply "more than a mere web site." Terminology that is meant to denote REALLY IMPORTANT STUFF has been around for a while. What has been missing in finding the right name is a view towards specific functionality that might contribute to a knowledge workflow on top of resources to make use of really important stuff.

In his keynote address at JCDL 2008 Alex Szalay explained that there is a science project pyramid that builds on a single lab at the base, a multi-campus project in the center, and international consortia on top as scientific disciplines recognize the need for major initiatives that are highly collaborative and distributed. He suggested that the output from these efforts at every scale contains:

–Literature

–Derived and re-combined data

–Raw data

Szalay would like to see a continuous feedback loop among these three aspects where data and analysis are always updating. In my view the active form that Szalay outlined should be encompassed by a term that implies the inherent function of a semantically-enabled analysis loop in a dynamic "knowledge waiting room."

Comments (0)Posted By : clt6 on 07/21/2008 under Broadening definition
Minimal metadata for sharing? 
For all practical purposes, the ability to express metadata as the
Dublin Core metadata elements is a sufficient baseline for sharing
repository items across subject and institutional domains.
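
To make the baseline concrete, here is a small Python sketch that maps a local record onto the 15 elements of the Dublin Core Metadata Element Set. The element names are the standard ones; the local field names, the mapping and the to_dublin_core helper are invented for illustration, and each repository would need its own mapping.

```python
# The 15 elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def to_dublin_core(local_record, mapping):
    """Return {dc_element: [values]} for the local fields that have a mapping."""
    dc = {}
    for local_field, value in local_record.items():
        element = mapping.get(local_field)
        if element is None:
            continue                      # no DC equivalent: not shared
        if element not in DC_ELEMENTS:
            raise ValueError(f"{element!r} is not a Dublin Core element")
        values = value if isinstance(value, list) else [value]
        dc.setdefault(element, []).extend(values)   # DC elements are repeatable
    return dc


# Hypothetical local schema and its mapping to Dublin Core.
local = {
    "headline": "A paper",
    "authors": ["Author, A.", "Author, B."],
    "abstract": "Short abstract",
    "shelf_mark": "X1",                   # purely local, has no mapping
}
mapping = {"headline": "title", "authors": "creator", "abstract": "description"}

shared = to_dublin_core(local, mapping)
print(shared)
```

Fields with no Dublin Core equivalent are simply dropped from the shared record, which is the price of using a lowest-common-denominator baseline.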
Tags : metadata dublin core sharing
Comments (0)Posted By : nf on 08/16/2008 under Consistency
nouns are for numpties 
Should we regard Repository as yet another noun of uncertain parentage that is searching for meaning? R is for Repository in the same way that P was for Portal, O was for Ontology and M was for Metadata. N? Nothing just yet, or maybe N is for Network. Fortunately, the other nouns have had some prior usage. Repository has less common usage except as a place to store furniture.

Personally I've always preferred verbs, as these almost automatically put the focus on actions (or states of being), on tasks of actors, as essential links in the subject/verb/object triple.

It is not self-evident to me whether Repository is a new label to describe something(s) that have existed for a while or something(s) that have come into existence to justify the term. In the early 1980s I worked with the Scottish Education Data Archive which was a collection of survey datasets (on related topics, and generated by a research centre in a university over time), then in the mid to late 1980s and 1990s I worked with Edinburgh University Data Library which was both a collection of user-contributed datasets and a collection of third-party published datasets. In both instances the purpose of the 'data archive' and 'data library' was to provide access to those datasets to them that wants them. For both, there was some forward thinking, in that we collected and curated ahead of demand: for example we took in user-contributed digitised boundaries of a particular geography before we knew anyone would re-use them. In the late 1990s and since I have worked on a variety of online services which depend upon the management of databases of data objects, datasets and datastreams that others (not me) have created, although these are mostly not 'user- or community-generated'. We did not call them repositories at the time - or not until quite recently.

Jorum is a national repository of learning materials, devised and developed by staff at EDINA and Mimas in response to expressed requirements to keep stuff safe, and to enable and facilitate sharing. What makes it a repository? It's a database that we call a repository. Why? Because that was the term of the moment and was and is understood within a certain 'designated community', but not much beyond. When thinking about Jorum, about the repository built for GRADE, and for the store of datasets used for eMapScholar, and then for the Depot (in the Prospero project) we thought that Cliff Lynch's statement that "a university-based institutional repository is a set of services" needed re-phrasing: a repository is a managed database that supports three (or more) services, necessarily including deposit (ingest), keep-safe, access (download). But any decently managed database does that surely?

M2M access, by API (and OAI-PMH) has been put up as a necessary characteristic of a repository, but that m2m access has been commonplace for many services from EDINA and Mimas, and again is that not just what we would want from any managed network-accessible database?
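
For readers who have not seen it, an OAI-PMH harvest is just an HTTP request with a verb and a few arguments. The Python sketch below only builds the request URL for the ListRecords verb with the oai_dc metadata prefix, both of which are defined by the protocol; the base URL and the list_records_url helper are made up, and nothing is actually fetched.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL (no network access here)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec          # optional: harvest one set only
    if from_date:
        params["from"] = from_date        # optional: records changed since this date
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository endpoint, used only to show the shape of the URL.
url = list_records_url("https://repository.example.org/oai", from_date="2008-07-01")
print(url)
```

Seen this way, the request is not very different from a query against any other network-accessible database, which is rather the point being made above.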

Digimap is built upon a range of databases, some populated by data from the Ordnance Survey, some by derived data (value added, curated by EDINA) and now also some contributed by users.

I confess I am at a loss to understand what is distinctive about a repository. Except perhaps, that the attention should focus on the quality and nature of the service that is delivered to the (potential) depositor. Understanding why someone wants to deposit (share) something, and what would constitute reward (in terms of happiness not just lack of pain) for the act of depositing is hard, elusive and novel. We are examining how to make the Depot into a service for happy putting, so too with Jorum. The motives for sharing differ, as does the nature of the workflow during which 'deposit' could be considered natural.

Now B is for Bucket: must it hold objects as well as liquid, must it provide means by which things can be poured into it, as well as out? Is there a hole in the bucket, does it have to have a handle, what if there was a spout?

Comments (0)Posted By : p.burnhill on 07/19/2008 under Repository functions
A series of 20 key interviews to assess the feasibility of approaches to improve consistency 
In particular we want to ask these key interviewees the questions to which you would like to hear the answers. So if you have an interesting or useful question (or more than one), particularly concerning the creation of user-facing services using repository content, then please use the comment facility to suggest it. Or even better put it here as a new idea (go to the home page and choose New Idea, then choose category "consistency") and then others can comment on it.

And we need to know who to ask. Who will be the most useful people to get these answers from - whose views would you be intrigued to hear on this topic? Again use the comments facility, or better still put a name here as a new idea (go to the home page and choose New Idea, then choose category "consistency") and then others can comment on it.
Tags : repository jisc
Comments (1)Posted By : nf on 06/26/2008 under Consistency
We should embrace inconsistency 
We cannot achieve consistency, so if it is important then we are doomed to failure. Why can't we achieve consistency?

There are (say) 200 universities in the UK, and perhaps 20,000 worldwide, then there are subject repositories, project repositories, library and archive repositories and commercial repositories (which may be free or charged for or a mixture).

There are data repositories, image repositories, paper repositories etc.

All these repositories are set up for particular reasons and will want to achieve different things. What the BBC wants people to do with theirs is very different from, say, NICE or the University of Wigan. They will, inevitably, have different collection policies, different ideas on appropriate metadata standards, different methods of accessing them (an image repository or data repository will require different affordances to a text repository).

To expect any form of consistency - of language, of policy, of metadata, of standards, even of legal scope - will simply not work.

Indeed, I would suggest that to achieve consistency we would require working in a closed community, and even then it would probably not work.

The alternative is to embrace inconsistency and work with that.
Tags : consistency repository
Comments (2)Posted By : tom on 07/02/2008 under Consistency
The repository should be a full OAIS preservation system 
We should at least have this on the table. I think repositories are good for preservation, but the question here is whether they should go much further than they currently do in attempting to invest now to combat the effects of later technology and designated community knowledge base change...
Comments (0)Posted By : c.rusbridge on 07/14/2008 under Repository functions