Initial thoughts on data.gov.my

For at least a year, Open Data conversations have been happening actively at the intersection of Malaysian government and citizenry. Way before that, on the “getting shit done” front, Sinar Project has been doing what they can with what little they have, with really cool results like their scraping of the Construction Industry Development Board database. On the “evangelise within government” front, as far as I’m aware, MDeC has been investing a lot of energy, in particular Pak Mei Yuet’s tireless team.

Now the national Open Data portal has launched. There are a few issues I’ve noticed with it:

  • Their dataset count is inflated, but probably not maliciously so. Take the entries from the Ministry of Youth and Sports, for example. They have five entries listed, but three of them are titled “Aplikasi Pemetaan Belia Malaysia”. These are not duplicate entries, however. They are three different representations that the ministry wanted to make available: a map, another map, and (thankfully) a webform to extract a dataset which (even more thankfully) you can eventually download as an xls file. I imagine the ministry had to separate each entry because the CMS wouldn’t allow them to list more than one “Download” link per entry. This is something that needs to be addressed, especially before people start propagating falsehoods like “there are 121 datasets available” because there simply aren’t.
  • What does the number of datasets really mean anyway? For instance, MOHA’s tab has entries on drug data, but they’ve made it available at monthly granularity, so one year of it gets counted as 12 datasets on this site. In many cases having a large annual or multi-year dataset is a natural thing, but an annual dataset would only count as 1. Counting the number of datasets this way is a vanity metric at best, and it could even be an impediment if it implicitly or explicitly starts to be treated as a KPI. There needs to be a more effective way to measure progress in this regard.
  • There’s also the semantic question of what constitutes data anyway. Purists will argue that data must be raw, and that visualised maps such as those provided by the Ministry of Youth and Sports are rubbish. I think maps can be useful, but only if they are made available in the context of facilitating data exploration with the option of extraction. NICTA has something like this in beta for Australia, which Open Data fans generally regard very favourably. I haven’t spent enough time with the maps provided by the Ministry of Youth and Sports to decide if they are a real value-add or mere eye candy.
  • Machine readability is missing in the vast majority of cases. The usual black sheep is well represented here: PDF files everywhere.
  • Even when machine-readable files are provided, there is a lack of standardisation. For example, in the Ministry of Youth and Sports case mentioned above, most would argue that properly formatted CSV files should be offered instead of XLS files.
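
To make the counting problem in the first two points concrete, here is a minimal sketch in Python of how the headline number and a more honest number can diverge. Everything in it is an assumption for illustration: the entries are hypothetical (only the “Aplikasi Pemetaan Belia Malaysia” title comes from the portal), the `series` field is something I made up to mark an entry as one slice of a larger dataset, and the portal’s CMS may expose nothing of the sort.

```python
def normalise(title):
    """Lower-case and collapse whitespace so near-identical titles compare equal."""
    return " ".join(title.lower().split())


def count_datasets(entries):
    """Return (raw_count, distinct_count) for a list of catalogue entries.

    Each entry is a dict with 'publisher' and 'title', plus an optional
    'series' key (hypothetical) naming the larger dataset it is a slice of.
    """
    distinct = {
        (e["publisher"], normalise(e.get("series") or e["title"]))
        for e in entries
    }
    return len(entries), len(distinct)


# Hypothetical catalogue, shaped after the examples in the list above.
entries = (
    # Three entries sharing one title (a map, another map, a webform).
    [{"publisher": "Ministry of Youth and Sports",
      "title": "Aplikasi Pemetaan Belia Malaysia"}] * 3
    # Two placeholder entries standing in for the ministry's other listings.
    + [{"publisher": "Ministry of Youth and Sports",
        "title": f"Placeholder dataset {i}"} for i in (1, 2)]
    # One year of monthly drug data, listed as twelve separate entries.
    + [{"publisher": "MOHA",
        "title": f"Drug data, month {m}",
        "series": "Drug data"} for m in range(1, 13)]
)

raw, distinct = count_datasets(entries)
print(f"listed entries: {raw}, distinct datasets: {distinct}")
```

With these made-up entries the portal-style count is 17 while the grouped count is 4. Grouping by publisher and title (or series) is only one possible definition of “a dataset”, and probably not the best one, but it shows why the raw entry count makes a poor KPI.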

That’s just what I’ve noticed from a quick skim, without attempting to actually do anything real. So yes, there are a few bugs to be sorted out. But I personally am very glad they got this online because, as they say, you should always be slightly embarrassed by v1.0. The alternative approach would have been to wait, probably indefinitely, for convergence on all potential issues, in which case we’d be worse off with nothing than we are with this something.

On a more “meta” level, the Malaysian government is not known for transparency (their handling of a recent international crisis makes that plain enough for all to see) so there may be some who will be tempted to dismiss this as being insincere and therefore devoid of value. A more specific argument might be to say that having an Open Data policy without a Freedom of Information (FOI) Act to back it up is putting the cart before the horse. I don’t buy that because the state governments of Penang and Selangor have had FOI acts for years now, with no open data to show for it as far as I’m aware.

But from a general governance/philosophy standpoint, I agree that the Malaysian government and Barisan Nasional are not the embodiment of openness, and I doubt they will seize data.gov.my as an opportunity to redeem themselves. It is nevertheless a useful resource, and one with the potential to become a lot more useful, for researchers, journalists, students, and maybe even app developers.

It’s difficult to say that data.gov.my constitutes a really good start. But it does constitute a start, which is definitely good. All things considered, they managed a minimum viable product. What they’ve put up isn’t completely frictionless to work with, but there is enough there to trigger some data projects for sure.

Now I only hope there’s enough uptake to justify more effort by MAMPU and the various data stewards and custodians.


A few disclosures that I thought could be relevant: In my capacity as founder of Big Data Malaysia, I’ve spent some time (all unpaid) chatting with Malaysian government types (mainly MDeC) about numerous things including Open Data. I was a volunteer in DAP general election campaigns in 2004/2008. I am a Malaysian citizen. I currently reside abroad.
