An Enhanced Right to Open Data
New and richer flows of data from organizations in the public space could enrich democracy and might improve effectiveness and efficiency. More public knowledge (one definition of “transparency”) could stimulate debate about services and money, increase vigilance and arm scrutineers. But more and better data will not in and of itself bring more accountability or improve services. We must not confuse volume of information with better decision making. Data must become information: it must be grasped and absorbed. Information has then to be applied. Accountability and public satisfaction could move together in a virtuous circle, provided the public understands the data proffered; provided those releasing the data themselves understand it and its potential; provided its quality and accuracy are guaranteed.
Open Data prompts questions about public capacity. The government’s response to proposed changes in the school curriculum allowing many more young people aged over 16 to continue studying mathematics and statistics shows the government itself accepts the public need to be better equipped. Open Data abuts the contention that those leaving education have to be better prepared to deal with data and numbers, for their own sake as employees as well as in their lives as citizens and family builders (dealing with energy tariffs, insurance, pensions and broadband offers). Open Data links with moves to improve the quantitative skills of university graduates.
As important as the volume of data are presentation and “visualization”, the discipline of making data more intelligible. In the jargon this means paying attention to metadata and data polishing. It puts emphasis on intermediaries to help the public make sense of data. Statisticians and academics are fond of the term “metadata”. This directs attention to the explanatory material that ought to accompany data release. A missing term is narrative. What the public want is data to tell a story about the performance of schools, crime in their area and so on. Open Data needs to look at who writes and who puts out these stories. Another key term is visualization – covering the many ways in which data, especially quantitative data, can be projected, for example exploiting the graphical resources of the web.
Data release should anticipate the sense the public will make of what is presented and how they might use data. Each department and agency should subject itself to a “data challenge”: is the information intelligible? Translating data into information that is fit for public consumption requires good analysis and interpretation, which is lacking in many councils. The question does not capture the dynamism and spirit of opportunity and innovation that ought to accompany data release. Departments and agencies should relish the chance to share their work (knowledge) with the public and make explicit efforts to present it in ways the public can grasp. The value for money of data release has to be denominated in terms of accomplishing the organization’s wider public purpose and be accommodated in its notional or actual budget for accountability.
The public tend not to distinguish whether a service provider is public, non-profit or private, though they need to know how it is paid for and how it accounts. A rule of thumb for the application of Open Data is the ratio of public support to turnover (including implicit public support): any positive figure would tip the organization into the category where Open Data applies. We want a culture in which elected representatives and service deliverers feel open data accomplishes their purposes. Open data should not become a stick with which public organizations are beaten, by emphasizing the way data might be used to punish or find defects; instead, it should be celebrated as the basis for “co-producing” services and engaging the public.
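The rule of thumb on the ratio of public support to turnover can be sketched in a few lines of Python. This is a minimal illustration only: the `Organization` class, its field names and the example figures are hypothetical, and how implicit public support would actually be valued is left open by the text.

```python
from dataclasses import dataclass


@dataclass
class Organization:
    """Hypothetical record for a service provider (illustrative only)."""
    name: str
    turnover: float        # total annual turnover
    public_support: float  # direct plus implicit public support (assumed already valued)


def public_support_ratio(org: Organization) -> float:
    """Ratio of public support to turnover."""
    if org.turnover <= 0:
        raise ValueError("turnover must be positive to compute the ratio")
    return org.public_support / org.turnover


def open_data_applies(org: Organization) -> bool:
    """Rule of thumb from the text: any positive ratio brings the organization in scope."""
    return public_support_ratio(org) > 0


# Example with made-up figures
contractor = Organization("Example Contractor", turnover=2_000_000, public_support=150_000)
purely_private = Organization("Example Retailer", turnover=500_000, public_support=0)
```

Under this reading, `open_data_applies(contractor)` is true and `open_data_applies(purely_private)` is false; a threshold above zero could be substituted if a less inclusive test were wanted.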
We need incentives and awards celebrating data release and data sharing. Instead of a (static) culture of rights, public organizations should make a dynamic commitment to data collection, handling and release. We could draw on past efforts to identify and praise organizations doing well to account for themselves in the broadest sense, including data sharing.
The definition of the key terms, whilst providing a wide scope for interpretation, should also include definitions of that which is not subject to the Open Data approach, so as to make it clear from the outset what datasets (if any) are considered out of scope for public sector organizations. In relation to non-government bodies providing public services, information about aspects unrelated to the delivery of their public service function is not in scope. Does that imply that ALL public sector data is in scope? We assume certain public sector data will be considered out of scope and a clear definition of what can be expected should be provided.
In our opinion, existing legislation would still have a role to play with regard to the publication of data. Tests with regard to the publication of salary and personal information should also continue to be applicable in certain situations where the publication of data may compromise personal or business relationships. The costs incurred by private sector organizations in responding to management information requests from public authorities, or to requests directed at public sector organizations, are not considered by the individuals making these requests.
If it is Government’s aim to establish a common data set against which all public bodies would have to provide a base data set, consideration should be given to the establishment of a baseline data set against which data is provided for free. Any information requests outside of the defined datasets could be chargeable, with a proportion of the fees payable supporting the potential Public Data in support of their activities and costs.
Government should ensure that the requirement to provide data does not create unnecessary burdens on public bodies, particularly those organizations working with small budgets. Nevertheless, there is a view that any organization that is in receipt of public funds should have a responsibility to account for how those funds are spent. This would include all organizations and any other bodies in receipt of public funds. The issue would be where to draw the line: for example, where private suppliers to public bodies engage sub-contractor organizations to deliver work, should the subcontractor also be subject to such measures?
The opportunities for public bodies to hide behind claims of excessive time to produce must be minimized, and this should be supported through a regulatory framework that compels publication in all but the most exceptional circumstances (e.g. to protect national security or personnel information). If appropriate, legislative and regulatory measures could be established by the responsible body to set industry-wide reporting criteria (e.g. for education, health and defence). Government should also consider a “published by default” scheme that could be written into service provider contracts with those organizations supporting public bodies, so long as the organization remains independent and is not seen to be at the sole service of government. Any other option would require the establishment of a new body, and the associated costs of doing so should be taken into consideration in the current economic climate.
Data should only be relevant to the service being provided; anything that does not directly influence an entitlement to a service or funding should not be collated. Open data should not be cross-referenceable between data sets from different organizations. Any information that would support an ability to cross-reference information against another data set provided by another public body must be carefully considered. Whilst the ability to cross-reference data is vital between departments, the ability to make such comparisons using Open Data must be removed to ensure privacy.
The potential to create additional burdens on public bodies as a result of implementing Open Data needs to be considered carefully. The exercise should encompass those bodies where the potential for creating unnecessary burdens is greatest, and consideration should be given to this both informally and formally. Whilst transparency and visibility of how public funds are spent are important, they should not be delivered in a way that substantially increases the burden within public bodies or adds to the costs they incur to ensure compliance. The only way to ensure this is to benchmark the existing burden on these organizations prior to the introduction of any supporting requirements under the Open Data initiative. This could be achieved through taking a phased view to establishing reasonable boundaries and limits in cost to produce. In summary, it would not be possible to measure the impact of introducing Open Data unless there is a clear understanding of what investment is being made in complying with existing requests.
There are a large number of contracts already in existence with considerable time left on them. Whilst introducing new Open Data standards with regard to new contracts would be relatively straightforward (once the legal and regulatory hurdles have been cleared), the legacy contracts should be amended to reflect any new requirements in support of Open Data. The requirements could be transitioned into existing contracts through change control. Where a supplier refuses to adapt an existing contract, Government should seriously consider whether or not the response is acceptable, and where this is not the case, the potential to re-tender a particular contract in the public interest should be considered.
Rather than undergo a costly exercise in defining a new set of high and common data standards, government should consider identifying current areas of best practice through the benchmarking of public bodies and existing data structures. We should seek to agree definitions of data terminology so that all organizations subject to the requirements of Open Data have a consistent understanding of common definitions for data fields and the content within them. Misinterpretation of information will be reduced and the ability to compare and integrate cross departmental data analysis will be increased. Consideration would also need to be given to the adoption of common reference standards.
Government provides many different public services. Whilst some standards for data collation can be made consistent to allow for greater comparison of perceptions of individual services, consideration also needs to be given to specific departmentally related data in order to identify local issues and areas of weak performance across individual departments and other public bodies. There is a need for balance between the benefits to service providers and the public’s interest in the services they receive. The public (and other interested parties) require the localized information to make decisions on services and in order to support delivery of the big society. However, public service providers and government itself have separate needs for consistent data in order to make informed decisions and direct comparisons across public service providers, markets and costs. Many organizations develop, manage and analyze their own user experience initiatives. There is significant cost in the localization of these initiatives, and government has an opportunity to centralize the definition of information around user experience whilst also reducing the local development costs associated with this.
The accreditation of information intermediaries will only work effectively if the organization responsible for the delivery of that accreditation (1) has enforcement powers to deal with ineffective or poor performance from accredited suppliers, (2) has some influence over the maximum costs that accredited organizations charge for access to information and (3) maintains an ability to adequately meet the needs of information providers (public sector and related service providers) and also the needs of the proposed information intermediaries. To support this objective would require an organization with independence and impartiality so as to ensure that public bodies and public service providers are treated equally. In addition, a set of detailed and coherent definitions and guidelines would need to be produced to clearly communicate to organizations what is considered to be private information and what would be considered as confidential with regard to protecting national security.
In order to ensure consistency in application, it is essential that an independent body or reviewer be appointed to oversee the application of the privacy and confidentiality categories, so as to ensure that public sector organizations are not unduly using the ability to withhold data under those categorizations. This role could potentially be fulfilled, but consideration should also be given to using an existing organization to oversee delivery. Departments and public service suppliers should be aligned to a common set of objectives and requirements related to the provision of Open Data, and these should be applied and monitored independently, with a consistent application of requirements and, where required, penalties to support compliance.
Whilst Board-level accountability already exists in support of a number of initiatives and requirements such as health and safety, it must be recognized that it is not necessarily these individuals who deal with the day-to-day requirements that support the development and application of supporting policies within organizations. The nature and scope of information that will be required to be covered with regard to Open Data is likely to require considerable support within public bodies and other service provider organizations. Unless we provide considerable detail in relation to specific requirements, consistency of data sets, consistency of interpretation, file formats etc., organizations will interpret requirements differently, delegate responsibility and create multiple layers of input and ownership. Board-level responsibility would make sense, but it must also be recognized that the need to comply with and provide Open Data will in itself lead to increased costs in order to provide the information. Government needs to carefully consider the impact with regard to man-days, overhead costs and additional burden on the public purse that will be created as a result.
Without a sanctions framework it is hard to see how the Open Data agenda would operate. There are many initiatives that have been tried and failed as a result of ineffective or limited enforcement. Public sector consumption is huge, and the organizations that will be required to comply with the Open Data requirements are large, complex and varied in their consumption of products and services. The scope of Open Data is equally large and would require information analyst support. Given pressures to deliver business as usual, it is hard to see how public bodies would maintain a focus on the delivery of Open Data if there was not some form of sanctions framework in place. However, any such framework would need to be consistently applied and enforced.
If a single organization would be responsible for overseeing data definition, collation, publication and licensing, why would there be any need for dedicated sector transparency boards? Surely, this is only creating another layer to deliberate and interpret any public data definitions. Government should not provide the opportunity for individual sectors to deliberate the confidentiality or provision of data. As long as the correct legislation, sanctions and frameworks are in place to ensure public bodies comply with and are bound by the requirements of Open Data, why would you need another layer to deliberate sector transparency? The only exceptions should be those stated that relate to private personal information and that withheld in the interests of National Security.
There is a need to establish a clear and well defined framework of data sets and data inventories that are applied consistently across public bodies and public service providers. Comparisons and meaningful analysis of data can only be achieved through the application of common definitions and measurements. The objectives behind the consultation on Open Data will only succeed if there is (1) an overarching body or organization with responsibility for delivery, (2) a clear and well defined framework of data sets, (3) a body responsible for monitoring and amending data requirements going forward and (4) establishment of an Open Data “data warehouse” to support the collation, storage and access of information in support of making it readily accessible. Failure to support the initiative appropriately will result in data discrepancies, non-compliance, inappropriate understanding of definitions and deterioration of confidence in the data from users and consumers.
The main issue with this is establishing the baseline. What data should be made “Open” across different public bodies and departments? For all organizations in the public sector there is a vast amount of data that is gathered and required to support the effective delivery of those organizations. There are also a number of data items that are required to support existing voluntary or statutory reporting requirements. Herein lies the problem: not all of these data sets are in the public interest, some are personal and confidential, and some are internal data sets used locally. Under the auspices of Open Data, data which relates to the efficient and effective performance of public services should be given priority, as should that which serves the public interest. The main issue here is the body or organization responsible for the setting of priorities for data sets for inclusion in a data inventory. The task is large and complex: it requires a thorough understanding of how public sector organizations and suppliers are structured, of their ability to store, access and deliver data sets across the public services being delivered, and the ability to compel public bodies and service providers to supply data in an appropriate and timely fashion. Only an overarching organization would be able to do so effectively, and even that is assuming it has the correct expertise, understanding and operational remit to do so.
Any organization operated through the use of public funds should fall under the requirements to capture, store and publish open data except where this involves the publication of personal data or data that is sensitive with regard to national security. As soon as this definition starts being diluted, the impetus and effectiveness of the Open Data requirements will be called into question. Personal data, and particularly data withheld under the auspices of national security, must be clearly defined from the outset.
The collation of data for the sake of it should be managed carefully. Individual organizations should be encouraged to collect the data that is essential to the delivery of their specific service. This will ensure that only pertinent data is harvested. The ability to cross-reference data based on a common identifier must be minimized except in relation to those services where such data is meaningful. The purpose and use of individual data sets within public bodies needs to be questioned. An overarching body could be the delivery agent, with a remit for considering which data is required and which is unnecessary, and could provide appropriate guidance and, if necessary, sanctions to ensure compliance.
Providers should not be allowed to “polish” data. Where appropriate, commenting on data or listing assumptions etc. should be encouraged so that the reliability and accuracy of data sets can be judged by the end user. Withholding data due to concerns over accuracy or quality should only be permitted in cases where the data is so unreliable it would inappropriately affect comparison with other data sets from within the sector to which it relates. In such cases a body should retain the authority to work with that particular public body or supplier to raise data standards so that data can be published. It would be important though to ensure that such a body was given the appropriate authority to intervene and compel the organization to change.
Part of the existing problem is the plethora of public sector organizations, their respective websites and the inability to find data on them easily. Once defined, the only way to store the data and make it accessible easily will be through a centralized portal or repository. Any other solution would simply increase costs and cause confusion to data consumers. Accepting the vision behind the exercises relating to Transparency, Localism, and Open Data, government must ensure that the combined impact is visibility, accountability and engendering of trust from the public that government, public bodies and public service providers are spending public funds effectively. Therefore, data sets should be published at national, local and sector levels where public interest and the desire for data is highest.
Government should not publish data for the sake of it. Broadening the net of what is captured and reported will only create additional burdens on already tight budgets. The effort required to support several existing data sets in the public domain is considerable, and their quality, together with the inability to effectively cross-reference or collate different groupings of such data, already presents a considerable challenge to all concerned. Publishing relevant data sets effectively will encourage confidence and improve information access and usefulness. Get the existing data sets correct and more refined first, then consider the gaps that still exist and how best information can be delivered that enhances both the public’s desire for information and also provides a useful measure of an organization’s effectiveness when compared to others within the sector, region or nationally.
The role of the American Government itself is to become a savvy user of the data being collated and generated. What data helps drive government policy on a public body’s effectiveness? How does government measure and rate its own performance? What data and in what format would provide the public and public bodies with effective information on which to compare services or performance? If the government itself does not make use of the information that will be generated in order to deliver service improvements and delivery, what is the point of having the data? This should not just be an exercise of “publish everything and let others interpret it”. Government itself has a responsibility to monitor, act on and improve data collation and development in order to provide the public with confidence on its performance; otherwise the exercise will be fruitless and simply another example of costly bureaucracy without delivering improvements and benefits. The principles outlined as a part of the Open Data consultation could provide benefit to the Government and the bodies seeking to use Government information. However, the generation of a clear, coherent and consistently applied scope for the data sets that are intended to be published presents a considerable challenge.
Whilst common and consistent data will support removal of barriers to entry for Small Enterprises, publication of all data for the sake of it, in whatever format it is collated, will increase the risks of unreliable analysis and will create a burden for Government with regard to cost and administration. To deliver this, it is essential that sufficient consultation is undertaken with information providers, industry and other user bodies. Having a coherent regulatory framework will assist Government and users.
The government should consider mounting – in collaboration with the research councils – a campaign to counter the scaremongering that goes on about data use by public bodies, especially those concerned with the advancement of knowledge. The public should be encouraged to view two-way data sharing as beneficial (economically and cognitively). Data sharing can save money and lead to better policies. The apparatus of control should be filleted and prevented from blocking, for example, the re-use of data collected by public organizations and data sharing between public bodies.
Open data can lead to improved organizational performance and stronger relations between the public, as citizens and service consumers, and providing bodies. Therefore any additional costs associated with data release and data sharing should be regarded as investment. The key link is between more openness and more accuracy. The government should find out how the public are using the data already released (for example on local authority spending) and consider establishing a center of excellence (which might be based at an existing public body) on “usability”.
The best information intermediaries are public bodies themselves. They should anticipate how data is going to be received and used and tailor presentation accordingly. The value of invigilators of the quality of public data has already been proven. Because independence is going to be a valued attribute of any organization subjecting official releases to scrutiny or criticism, it will best be situated at arm’s length from the government. The government might consider endowing a non-profit organization to do this work. Open Data should be part and parcel of performance and monitored accordingly. Government’s role includes identifying and extolling good practice, which includes data and information handling in the round – i.e. the ways in which information is collected from the public as well as how it is passed out. Such government bodies have made commendable efforts to open up operations and finance to public view, and already release large quantities of data.
Data culture should be a board item, with responsibility diffused among non-executive and executive directors. Non-executives in particular should constantly be putting themselves in the place of the public and assessing the intelligibility of data flows. Open Data should be characteristic of good public management. Its value lies in interaction between public organization and public and “rights” could ossify what will be a dynamic and evolving relationship. Data inventories are probably best put together at a scale bigger than that of the individual organization, since public organizations a) share common data sets and b) collect similar or the same data from the public. The simple test is: is the data necessary for achieving the organization’s stated public purpose?
America needs a data strategy. One of the missing ingredients of the Open Data initiative has been that – preparing a comprehensive analysis of what the states (and their various dependencies, including private firms) need to know. Again, this is a dynamic conception. The states need to anticipate knowledge needs for future years and conduct studies and data interrogations with the population of the future in mind. The contours of the state and public services change and with them the “cognitive” bases of government. It follows that some data sets will be anachronistic and should be subject to periodic review.
Would any self-respecting board calmly say we don’t mind if performance data is dubious? Data labelling is important. Polishing data costs money and takes time. “Quick and dirty” data may do, on occasion. But it needs to be identified as more or less reliable. It would not be hard to put together a “grid” attesting to the quality of data, formed from the professional opinions of statisticians, and by the views of those involved in assembling and processing data for government (chief scientific advisers, networks of analysts).
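One way to picture such a “grid” is as a small table of data sets, each carrying a reliability label derived from professional opinion. The sketch below is purely illustrative: the 1 to 5 scoring scale, the thresholds, the field names and the example scores are all assumptions, not anything proposed in the text.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List

# Assumed thresholds on a hypothetical 1 (poor) to 5 (excellent) scale
LABELS = [(4.0, "high"), (2.5, "moderate")]


@dataclass
class QualityEntry:
    """One row of a hypothetical data-quality grid."""
    dataset: str
    statistician_scores: List[float]  # professional opinions of statisticians
    analyst_scores: List[float]       # views of those assembling and processing the data

    def reliability(self) -> str:
        """Average all opinions and map the result to a reliability label."""
        combined = mean(self.statistician_scores + self.analyst_scores)
        for threshold, label in LABELS:
            if combined >= threshold:
                return label
        return "low"


# Example rows with made-up scores
grid = [
    QualityEntry("local authority spending", [4, 5], [4, 4]),
    QualityEntry("quick and dirty survey", [2], [3, 2]),
]
for entry in grid:
    print(f"{entry.dataset}: {entry.reliability()} reliability")
```

A label of this kind would travel with the release, so that “quick and dirty” data can still be published while being clearly identified as less reliable.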
The public are entitled to see an assessment of the reliability and accuracy of data presented to them. They deserve, too, some account of the significance of data. Low quality data can be significant just as high-quality material can be of trivial importance. This returns to the question discussed above: those who release data should be duty bound to comment on its worth – metadata matters as much as data.
The question of departmental vs central portal is less pressing than putting together a data strategy. A starting point is assessing government’s knowledge needs. The strategy would also embrace release procedures and archiving. Storage protocols, access and search engines would be part of this. Much data is held and is subject to release by government to local authorities and arm’s length bodies. Their release plans might be autonomous, but they could be required to observe templates written as part of a national data strategy. Organizations should be allowed to prioritize datasets according to their business plans.
The government needs a “clever center” for Open Data, staffed in part by people who understand the specifics of departments and their data economies and government away from the center. A precondition for innovating in Open Data is, to repeat, minimum levels of public understanding, both of the data people share with government and of the data they receive from government.
Jeff C. Palmer is a teacher, success coach, trainer, Certified Master of Web Copywriting and founder of https://Ebookschoice.com. Jeff is a prolific writer, Senior Research Associate and Infopreneur having written many eBooks, articles and special reports.