Dec 20, 2011

It’s a bit early to be making predictions about how IGIBS might evolve, but a recent presentation to the EDINA geoteam, followed by some discussion, indicated some of the possibilities.

  • The WMS Factory Tool.  With the simple but effective styling capability that Michael Koutroumpas engineered, I think we have a prototype that’s not too far off a production-strength tool.  There are loads of scenarios where it’s valuable to have access to a tool that makes it easy to see your “non-interoperable” data alongside the growing number of INSPIRE View Services (read WMS) going online from public authorities across Europe.  So top of my list is improving this tool’s styling capability.
  • Associated with this would be a better understanding of the necessary data publication infrastructure, eg, making it easy to use the other OGC Web Services.  Something like the GEOSS Service Factory ideas emerging from the EuroGEOSS project.  I think there is a real demand for tools to make it easy to use the OGC standards.
  • In the immediate future, I think it’s likely that the IGIBS team will do some promotion of the project outputs, eg:
    • presenting the project at relevant events, eg, Association GI Laboratories Europe conference, OGC Technical Committee meetings.  This might cost as little as £500 depending on where the event is.
    • use of social media to promote both the WMS Factory Tool and the report on “Best Practice Interaction with the UK Academic Spatial Data Infrastructure”.  This too could cost as little as an additional £500.
  • The latter report is worthy of a lot more investment.  A major output from this project, possibly the single most important output, is the increase in use of UK academic SDI services within the Institute of Geography and Earth Sciences (IGES) at Aberystwyth University.  IGES is acting as an exemplar for best practice research data management around geospatial data; the department is actively building on the IGIBS work, and it will be interesting to see how it develops and whether other departments in other institutions see the benefit and start to emulate what Aberystwyth is doing.  More work promoting Steve Walsh’s report would help.
Oct 13, 2011

Now available, the registration page for the GECO/IGIBS event on Friday 11th Nov, 2011 from 1115 to 1500 GMT at the Welsh Government Buildings, Cathays Park, Cardiff.

Full details can be found here

We have a good mix of speakers from the academic, public and private sectors, and should get some good discussion.  I think it will be especially interesting to get some insight into the developing plans for how the devolved government of Wales is rolling out INSPIRE.

From the IGIBS perspective, this is us effectively delivering the first demonstration of UK access management technology being used to secure public sector services in combination with academic sector services, as per the project plan.

Oct 10, 2011

After some fantastic help from James Reid at EDINA we thought to put together a blog post summarising some of the conclusions we have come to over INSPIRE.

At this stage it may be worth having a look at my earlier but less informed post regarding INSPIRE to see how my understanding of the issues has progressed.

For INSPIRE to be something that universities need to spend time and money complying with, there are several questions needing an answer. We are not in a position to answer all of them with 100% certainty but, with the help of James, here are some conclusions we have come to.

1. Are universities “public bodies” or, more accurately, public authorities? This appears to be one area that the fog has lifted from. The INSPIRE Regulations will only apply to public authorities, and James has taken the trouble to check out this area with Edinburgh and is certain that universities are public authorities for the purposes of INSPIRE.  So one “Yes” to INSPIRE.

2. Do universities hold and control datasets that match the data described in any of the INSPIRE data Annexes? After looking through the datasets collected for the IGIBS project I have found 11 (or about 5% of them) that match up with some of the data themes in Annex III. The IGIBS data is probably not representative of the total extent of data held by Aberystwyth University, and an inventory of data held by some of the academic staff would be needed to quantify the amount of INSPIRE data held.  So another “Yes” to INSPIRE.

3. What is the public task of a university?  Here is where the situation becomes less clear.  There appears to be no public task defined for universities. The problem seems to stem from the fact that universities are not covered by the PSI Regulations and therefore have not needed to define a public task for themselves.  Again James has made some progress on this and pointed to a publication from the National Archives that helps explain the process of defining a body’s public task.  There has also been some slightly ambiguous advice from the Scottish Information Commissioner that includes a suggestion that it may be relevant for a university to seek legal advice over the issue. So there seems to be no clear answer to this question. A case of “we don’t know yet”.

4. Do those data identified in 2 above relate to the public task of the university? Again, until we know the answer to 3 above we can only guess at the answer to this question. Common sense suggests that research and teaching must be part of the task if it is ever defined. So my guess would be a “probable Yes”.

5. Will there be any attempt to enforce the regulations? Again no way of knowing the answer to this and it may even involve some judicial intervention to clarify the situation. Strictly speaking if Universities are public authorities for the purposes of the INSPIRE Regulations then they are already not complying with INSPIRE as they have not established a complaints procedure to deal with questions over INSPIRE data provision as required by the Regulations. So currently a “NO” but with the uncertainty surrounding public task it could be a complicated or impossible job to enforce this regulation at present. So this will have to be a wait and see area.

Sep 05, 2011

I was fortunate enough to have a meeting with some people from EDINA and the DCC in Edinburgh on Wednesday. The aim of the meeting was to get some input and advice from some experts on the ideas I have for a spatial data management best practice report.  So a big  thank you to Martin Donnelly of the Digital Curation Centre (DCC), James Reid, Stuart McDonald, Chris Higgins and Michael Koutroumpas from EDINA.

I had a long 7 hour train journey from Aberystwyth so my apologies for the overdose of PowerPoint slides that I had time to create before the meeting. It was extremely helpful to talk to experienced and knowledgeable people about the direction of the report, which is one of our outputs from the IGIBS project. My background in environmental science leaves a few significant gaps in my knowledge and, as Chris put it, “a sanity check” on my work was well worth the time needed to attend the meeting. I even had the opportunity for an evening walk on Arthur’s Seat and a lunch hour looking around Edinburgh as a bonus.

Some of the key advice from the meeting centred on the following: INSPIRE and how it will or won’t impact on universities; insights into the not so obvious but very significant benefits of writing a data management plan and where it fits into good data management; some great pointers to other studies and sources of information that will feed into the report; the need to make the report easily accessible to its audience; and some great institutional case study examples, from Australian through Californian to British universities.

Another theme that emerged from the discussion was how INSPIRE and the need for good data management can be viewed as a threat but it is also a great opportunity for academic staff to gain easier access to the ever increasing amounts of spatial data being created around the Globe. A viewpoint that will help to make the report more appealing to time starved researchers.

We also had talk of semantics and just what do you call a spatial data infrastructure (if you don’t want to use SDI). It was suggested that UK Location has moved towards Location Information Infrastructure as a way of making an SDI label more intelligible to the uninitiated. I found this much more enlightening and useful than the recent update from UK Location on “Data Things” and abstracted “Data Objects”, but a few hours of digestion may make this a little more understandable to my irretrievably ecologically orientated mind.  It reminded me of some reading I had done about old Norse governance and how their assembly was called the “Thing” and met in the “Thingstead”.  I remember thinking that they didn’t have a proper word for it so just called it the “Thing”, but I guess that just shows how language develops over time, and maybe we can look back to SDI in a few years with the benefit of a really useful label for it, whatever that may be.

As a result of the meeting I am rewriting some sections I had drafted and adding some new summary sheets for subsections of the intended audience. More importantly, I don’t feel like my original thinking was miles off the mark, just a bit under-informed and lacking some focus.  So creating the rest of the report will also be made a little easier once I have digested the new material I have been pointed towards.

So thank you once more gentlemen and I look forward to meeting you again if the occasion arises.


Aug 08, 2011

My apologies to the Duke of Wellington for mutating his often quoted call to a jilted mistress about his intimate letters, but the sentiments in the original statement do suggest the power of the publication process to give information a life of its own. 

If geospatial data were published in a similar way to research findings (or even to letters from the rich and famous to their mistresses) then data management and the academic spatial data infrastructure (SDI) would be an even more  rapidly developing entity, that had the commitment of every academic researcher that generates such data.  OK so a sweeping unrealistic statement but this is how I came to the thought…. 

One of the pillars of modern day science is the peer review process. It takes a piece of original research and, through the  publication process, many stages of refinement are applied to it until the researcher is satisfied that it stands a chance of acceptance by an appropriate journal. During this process it will be proof read, checked for errors and formatted in the appropriate way. Then after input from independent referees it will be further improved and finally, if judged acceptable, the research is published where it can be accessed, seen and discussed by the wider scientific community (or any community that wishes). 

After publication it is archived and catalogued so that it can be found on-line or in hard copy and can be used and quoted by anybody who wishes. There may be a network of people and libraries that will have subscriptions to the journals and they will see the newly published articles appear on their shelves, desks or screens every few months. Finally if the research is worthy it may be used as a component of more research and go on to help develop the knowledge base. 

Concomitant with this process is the recognition given to the researcher and to the associated Institution that can result in promotion for the former and extra funding for the latter. This rather idealised description of the peer review process is something that the majority of academic staff and postgraduate students are fully engaged with and committed to. You might have to forgive my simplistic view but all I wish to establish is the principle that the publication process and the recognition it attracts drives the quality, accessibility and reuse of research findings. 

Now let’s consider how geospatial data is managed. It’s not so easy to simplify the process as there will be much greater variation. Some important data will be lodged with data centres where it may have a guaranteed 10 year life span (if it’s lucky) or maybe the metadata will be put in a discoverable place with a series of hurdles to cross before anybody can get access to the data itself. Quite often the data will never leave the IT systems of the Institution that it was created in; rather it will take second place to the research publications and may not be made accessible at all. This has been shown to be through worries over intellectual property, through a lack of awareness of suitable data management and publication methods and through a lack of recognition for such activities.  One thing is for sure: most data will not have the same exposure to the science community as the research findings it supported. 

The long term life expectancy of such data is also likely to be shorter than its wordy cousin the research paper.  In fact it may not even exist after its collector has moved posts or retired or suffered a serious IT problem. It is very unlikely to be archived as well as the printed word and its creator is much less likely to have received credit for collecting it and the institution she/he works for is unlikely to receive improved research ratings or extra funding for generating it. 

Now imagine a world where data is King (or at least Queen alongside the research paper King) and research funding and University Chairs are partially reliant on the proper publication of peer reviewed data sets. I think that a fully functional academic SDI with all the bells and whistles that you could want would be a reality within the next decade. In the same way that JANET has developed, and continues to develop, in the UK with its fast speeds and links to other countries’ networks, so would the SDI. 

Obviously this isn’t an original idea: there are a few journals dedicated only to data publication and there are strong policy statements all the way from Government through the publicly funded research councils and even to a few Universities that make the publication and accessibility of data a priority. What is missing are real incentives for researchers to treat data in the same way as research findings, and until this gulf is filled data will be the poor relation of the academic publication world. If the translation of the INSPIRE directive (see my previous post) into European Governments’ actions includes Universities then maybe it will provide some significant infill for this gulf and move the discussion from “why should I?” to “how do I?”


Aug 01, 2011

I have been reading and thinking about the relationship between long term spatial data preservation and the short term needs of day-to-day data security during the life of a research project. With research data being generated at faster and faster rates and the life cycles of supporting technologies getting shorter, data preservation is destined to be a continual problem requiring new and smarter solutions every few years. Just dealing with new data that can be produced by Earth observation satellites at the volume of terabytes per day, and may exist in several formats as it passes through complex processing stages, is enough to take the issue well into the scope of being a serious problem.  The sentiment expressed by Moss points to the aspiration of researchers for their hard won data.

“Scientists now want to keep everything, which they assume is digitally possible, in the belief that everything has value and can be retrieved and repurposed.”  Michael Moss 2008

The question is; will the technology and the resources exist to meet this aspiration?

It is very easy for a researcher in a Higher Educational institution to secure data in the short term via either their own arrangements or by using the services of a central IT department. In that way data can be backed-up to multiple locations and held on hardware that is up-to-date and covered by manufacturer’s warranties. I am sure this professional approach is found in most (or all) HE institutions. It’s still up to individuals to avail themselves of these services but there are few obstacles standing in the way. Even storage costs are falling and for a few £s per gigabyte a university department can store data in professionally managed institutional servers.  

In the hierarchy of data preservation the next levels up become harder for a researcher to arrange.  Consider keeping spatial data for third parties to discover and use for the next 5 years. Immediately there is the need for precise and comprehensive metadata. This has been addressed in several ways and the development of specific standards via UK AGMAP has given anybody who looks an easy lead into useful metadata creation. For this longer term data storage it may also be necessary to look outside of your home institution to ensure suitable data curation and discoverability.  But where do you put the data? It needs an accessible location that links metadata and the data object and makes them discoverable by future researchers.  The tools provided by GoGeo provide a solution. Data can be described and even lodged within this service so it becomes searchable and accessible to other researchers. 

So there is an infrastructure for this stage of the data management process but it now needs the data producer to step outside of their daily routine and to work on tasks not always considered core for a busy academic looking to their next paper.  This 5 year time horizon is also significant in that the European INSPIRE Directive will be in force for its Annex III type data by 2013. This means that many university generated geospatial data sets will need to comply with the INSPIRE standards promoting interoperability across boundaries. Possibly more difficult to achieve will be dealing with older data which will also have to meet INSPIRE standards by the next decade.

Once we look beyond the next few years and start to focus on spatial data of high quality or significance, things get really interesting and much more challenging.  It’s very easy to talk of data archiving and curation as if there are standard, easily accessed facilities in every library. The more I have read, the more I have realised that it’s a far more fluid and developing science than I appreciated. 

Who decides which data are in need of professional curation, or which data can we afford to curate?  These kinds of questions move the process beyond the researcher into the realms of professional librarians or data curators and government departments working to budgets and policies.  All this leads to a further stream of questions: Can data be given to one institution to look after? Can we guarantee that any institution will be a permanent fixture?  Will the metadata that was created during data collection still have sufficient context to be useful in 10, 50 or 100 years’ time? How will the increasing number of data objects be kept searchable and accessible?  With hardware life cycles only being a few years, who will ensure the passing on of data to the next technology and will that technology still support the data format? These questions just start to scratch the surface of the issues involved in designing future data curation methods and policies.

Let’s hope that the situation described in the quote below won’t be applied to the early 21st century when looking back in 20 years time.

“In terms of preserving our digital cartographic heritage, the last quarter of the 20th century has some similarities to the dark ages. In many cases, only fragments or written descriptions of the digital maps exist. In other cases, the original data have disappeared or can no longer be accessed due to changes in technical procedures and tools.”  Markus Jobst 2010

It’s possible to take this timeline one stage further and start to consider which spatial data sets, that are so important to major scientific discoveries or advancements, should be considered for preservation in  the equivalent of a scientific museum that holds the essential heritage of our scientific community.

Now where did I save that first human genome I was given for safe keeping in 2003? Ah well, not to worry, it wasn’t geospatial data anyway, well not unless the DNA donor had an address?! Oh, and it was a mapping project so I’d better find it…….

Jun 04, 2011

One of the many goals of IGIBS is to generate Web Map Services that will be used in conjunction with INSPIRE type View Services which themselves are compliant with the INSPIRE Technical Guidance for View Services version 3.0. To that end, it made sense to take the following basic INSPIRE criteria into consideration when making our choice of tools:

  1. Support for the LANGUAGE request parameter in a GetCapabilities Request.
  2. Support for “extended attributes” including elements extending the
    _ExtendedCapabilities substitution group of the WMS 1.3.0 schema with a custom
  3. Support for the optional WMS 1.3.0 elements wms:Identifier, wms:AuthorityURL and wms:LayerLimit
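
To make the first criterion a little more concrete, here is a minimal Python sketch of how a client might ask a View Service for its capabilities in a particular language. The endpoint URLs and the function name build_getcapabilities_url are made up for illustration, and the three-letter code "eng" is an assumption about the language codes a given service accepts; only the SERVICE, VERSION, REQUEST and LANGUAGE parameter names are taken from the WMS 1.3.0 standard and the criteria above.

```python
from typing import Optional
from urllib.parse import urlencode


def build_getcapabilities_url(endpoint: str, language: Optional[str] = None) -> str:
    """Return a WMS 1.3.0 GetCapabilities URL, optionally with LANGUAGE.

    `endpoint` is a hypothetical service URL; it may already carry a query
    string (for example a cgi wrapper with its own parameters).
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetCapabilities",
    }
    if language is not None:
        # INSPIRE-specific extension to the standard request.
        params["LANGUAGE"] = language

    # Pick the right joining character for the endpoint we were given.
    if endpoint.endswith(("?", "&")):
        separator = ""
    elif "?" in endpoint:
        separator = "&"
    else:
        separator = "?"
    return endpoint + separator + urlencode(params)


if __name__ == "__main__":
    # Hypothetical endpoint, used only to show the shape of the request.
    print(build_getcapabilities_url("https://example.org/wms", "eng"))
```

A service that ignores the LANGUAGE parameter would still answer such a request, which is why the capabilities document itself has to be checked for the extended language information.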

Up till ~3 weeks ago (May 12th) no stable release of either geoserver or mapserver satisfied any of the above criteria.

Mapserver Customisation

In order to make IGIBS services INSPIRE compliant we are using a customised version of mapserver 5.6.6. The customisations involve backporting selected features from the development tree of version 6.0 plus our own additions to add support for the LANGUAGE parameter and the extended attributes in the GetCapabilities response. The code is available for perusal here for any interested parties. It comprises a patch against mapserver 5.6.6 plus a sample mapscript wrapper that can be run as a cgi to provide an INSPIRE compliant View Service. Since Mapserver 6.0 the patch should no longer be necessary, but the mapscript wrapper is still required.
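
The wrapper itself is not reproduced here, but the language handling that such a wrapper needs can be sketched in a few lines of Python. This is an illustration rather than the IGIBS code: the function name resolve_language and the language codes are hypothetical, and the fallback rule (answer in the service’s default language when LANGUAGE is missing or not supported) is my reading of the Technical Guidance rather than a quotation from it.

```python
from typing import Iterable, Optional


def resolve_language(requested: Optional[str],
                     supported: Iterable[str],
                     default: str) -> str:
    """Choose the language for a GetCapabilities response.

    Assumed rule: use the requested language if the service supports it,
    otherwise fall back to the service's default language.
    """
    supported_codes = {code.lower() for code in supported}
    default_code = default.lower()
    if default_code not in supported_codes:
        raise ValueError("default language must be one of the supported languages")

    if requested is None:
        # No LANGUAGE parameter in the request.
        return default_code

    candidate = requested.strip().lower()
    return candidate if candidate in supported_codes else default_code


if __name__ == "__main__":
    # Illustrative codes only; a real service advertises its own list.
    print(resolve_language("FRE", ["eng", "fre"], "eng"))
    print(resolve_language("ger", ["eng", "fre"], "eng"))
```

A cgi wrapper could call something like this once per request and then select the matching set of titles and abstracts before handing over to the map server.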

Latest Developments

On May 12, 2011 mapserver released version 6.0 and geoserver released version 2.1.0. As part of that release, Geoserver got funding from the Ordnance Survey to add support for the aforementioned INSPIRE spec as a plugin and can now satisfy all of the above criteria, while mapserver only got support for the wms:LayerLimit attribute.


The choice of software depends on one’s requirements. For a national mapping agency seeking INSPIRE compliance it seems that geoserver 2.1.0 is currently the best route. For the purpose of IGIBS, we will stick to the modified mapserver 5.6.6 for the following reasons:

  • Speed. Mapserver has performed considerably faster in our tests involving rendering and reprojection of geospatial data, which is crucial for the dynamically generated services of IGIBS.
  • Flexibility. Mapserver can be very easily scripted in a high level language for prototyping and experimentation.
  • Tried and trusted modifications to ensure compatibility while still being flexible enough to follow the fluid INSPIRE specs.
  • Geoserver does not yet fully support all parts of the INSPIRE TG e.g. the  “scenario 2” mentioned in the standard.

Please feel free to submit any comments.


Jun 03, 2011

The overall aim of the IGIBS project is to try and improve the relationship between the UK’s National Spatial Data Infrastructure (SDI) as manifested through the UK Location Programme (UKLP) and the UK’s academic SDI.

Our main objective is to focus on use cases emerging from research and education related to a particular area – the UNESCO designated Dyfi Biosphere Reserve.  Once articulated, these user requirements will drive the creation of two pieces of software of wider applicability and assist Aberystwyth University in developing resources for use by local students.

We are building on much prior art, especially in the area of Access Control.  EDINA runs the UK Access Management Federation (UKAMF) and, while it might not be fashionable, the reality is that many SDI resources, eg, data and web services, are going to stay protected.  This is true both of INSPIRE at the European scale and the UKLP nationally.  We aim to show how Shibboleth (the open source software that underpins the UKAMF) can be used to enable a wider range of use cases, so that UK students can get access to both open and protected resources, eg, from UK public authorities like Welsh Government.

We expect that the main four products resulting from this project will be:

  1. Working prototype of a “WMS factory” tool
  2. Simple mapping application
  3. Best Practice model for using UK academic SDI at the departmental level
  4. Demonstration of UK access management technology being used to secure public sector services in combination with academic sector services

SDI is underpinned by open geospatial standards like the OGC’s Web Map Service (WMS).  The “WMS factory” tool will allow users to upload their data and instantiate a WMS so that their data can then be viewed online, via a simple mapping application, in conjunction with reference data from Welsh Government.
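
As a rough illustration of what “in conjunction with” means at the protocol level, the Python sketch below builds WMS 1.3.0 GetMap requests for two services over the same bounding box, so that a client could draw the transparent user layer on top of the reference layer. The endpoints, layer names and bounding box are hypothetical, the choice of EPSG:27700 (British National Grid) is an assumption, and build_getmap_url is not part of the IGIBS software.

```python
from typing import Sequence
from urllib.parse import urlencode


def build_getmap_url(endpoint: str,
                     layers: Sequence[str],
                     bbox: Sequence[float],
                     width: int = 800,
                     height: int = 600,
                     crs: str = "EPSG:27700",
                     image_format: str = "image/png",
                     transparent: bool = True) -> str:
    """Return a WMS 1.3.0 GetMap URL for the given layers and extent.

    `bbox` is (minx, miny, maxx, maxy) in the axis order of `crs`.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",  # empty value asks for each layer's default style
        "CRS": crs,
        "BBOX": ",".join(str(value) for value in bbox),
        "WIDTH": str(width),
        "HEIGHT": str(height),
        "FORMAT": image_format,
        "TRANSPARENT": "TRUE" if transparent else "FALSE",
    }
    separator = "&" if "?" in endpoint else "?"
    # Keep commas, colons and slashes readable in the query string.
    return endpoint + separator + urlencode(params, safe=",:/")


if __name__ == "__main__":
    # Illustrative extent only, not a surveyed boundary of the study area.
    extent = (250000, 280000, 290000, 320000)
    reference = build_getmap_url("https://example.org/reference/wms",
                                 ["base_map"], extent, transparent=False)
    uploaded = build_getmap_url("https://example.org/factory/wms",
                                ["my_upload"], extent)
    print(reference)
    print(uploaded)
```

Because both requests share the same CRS, extent and image size, the two images line up pixel for pixel, which is all a simple mapping application needs in order to stack them.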

Shibboleth is already used in academia; we extend its use here to demonstrate how public sector data can be made securely available to authenticated and authorised users within academia.

The Institute of Geography and Earth Sciences (IGES) has ambitions to improve the way it educates students in the use of open geospatial interoperability standards and intends using the Dyfi Biosphere Reserve area as an exemplar.  To this end we are conducting an inventory of data for the area and creating a repository for educational use.  The “Best Practice model for using UK academic SDI at the departmental level” will feed into this activity as well as provide guidance for the wider university sector.

Apr 18, 2011

Information is now being collated on available data sets to incorporate in this project. We have identified a number of case study users from the Institute of Geography and Earth Sciences (IGES), Aberystwyth University and Forest Research in Wales, Forestry Commission who have previously worked, or are currently working, on projects based in the Dyfi Biosphere.

As part of the process for gathering this information users are being actively encouraged to create dataset metadata using the GeoDoc tool, found within the GoGeo area on the EDINA web site. This utility is used to create standards-compliant dataset metadata for upload into catalogues, eg, GoGeo!, so that the data can be discovered, evaluated and possibly reused. Note that you need to have UK Access Management credentials to use GeoDoc.

Users that we have identified so far consist of academics, researchers and students within IGES in Aberystwyth University, and from the Centre for Catchment and Coastal Research (CCCR), which is a consortium of Aberystwyth University and Bangor University. Users will also include researchers from Forest Research in Wales, Forestry Commission and staff from the Countryside Council for Wales (CCW). Within these bodies individuals have been identified and we will develop these as user case studies. We are currently collating their data sets and identifying their relevant uses and needs.

In the following weeks we will collate and input data sets, some of which are complete whilst others are work in progress. These data sets will come from the individual user case studies. The user case studies will be something like the following:
• IGES Academic/Researcher
• IGES/CCCR Academic/Researcher
• IGES MSc Student
• IGES PhD Student
• IGES Digital Map Librarian
• Forest Research Researcher
• CCW Senior Reserve Warden for Dyfi Biosphere Area

A ‘shopping list’ of data sets that are either not currently available to these users (and which they would like access to) or are difficult to find will also be identified and collated. Already we have had requests for biogeochemical data sets from IGES/CCCR, and for remote sensing data sets from Forest Research. It is hoped that Welsh Assembly Government may be able to help with some of these data and that, even if their use is restricted, we may be able to offer access to them through web services secured using Shibboleth (the software underlying the UK Access Management Federation).

So far we have identified from the academic/researcher evidence that both academic staff and students would find the Web Map Service (WMS) “factory” application useful as a research and teaching tool. It has also been suggested by one of the academic users that an undergraduate module could be developed around the use of open geospatial standards. It was agreed that using the GeoDoc metadata input facility would generally improve data management practice for research projects.

Any comments from the user case study individuals or other potential users would be much appreciated to ensure the relevant uses and needs of all involved in this project are identified. The information will feed into the development of the mapping application and the identification of future requirements.

