Platypus Innovation: 2014

13 October 2014

Broken by default: Java 8 is not quite backwards compatible

Java 8 introduces several nice features, one of which is the default keyword. This is close to having mix-ins (aka traits; inheriting code from multiple parents) which is great. It also lets you add new methods to an old interface without breaking existing implementations. Or that's what it's meant to do.

But Oracle have broken backwards compatibility.

It is a compile error for a class to inherit conflicting default implementations of the same method from two interfaces. If that happens, then the class must override the method itself to specify which behaviour it gets.

But if the default method's signature involves a class that is new in Java 8, then overriding it will break your code on Java 7.

I encountered this because we have a class which implements both List and Set. It's now impossible to do that if your code has to work across Java 7 and 8. To work in Java 8, it has to override spliterator(), but that will break on Java 7, which knows nothing of the Spliterator class.

23 September 2014

How Social Media Failed Scotland

We've just seen (and been part of) a great democratic event in Scotland -- the whole country engaged in politics, culminating in 85% of the electorate voting at the referendum.

As we all know it was a close race with 55% voting No (pro-Union), 45% Yes (pro-independence). But that's not the picture you'd get from Twitter, where 90% of tweets were pro-independence.^[1]

That's a failure for social media -- a great debate took place, but not on Twitter where nearly half of the people were absent. Unwilling to take part in an unwelcoming space?

Age of voters does not explain this. An opinion poll taken on the day shows the No vote was strongest amongst older voters^[2], fewer of whom will be on social media. However, the other age groups showed closer to an even 50/50 Yes/No voting -- nowhere near the 90%+ observed on Twitter. Even taking age into account, social media was one-sided in an unrepresentative way.

Social media worked fantastically for the Yes campaign. The stream of Yes posts helped to build and maintain energy, spread pro-Yes knowledge through the network, shore up support, challenge No voters' thinking, and create a peer group effect in favour of Yes. Undoubtedly, it helped Yes to close the gap towards almost winning. However that very success had a downside: The No voters were absent -- until polling. They largely stayed out of the online debates, and you can't influence people if they aren't engaged.

Cybernats and subtler

Much was made in the mainstream press of Cybernats, abusive Scottish nationalists active on social media. Having examined a lot of tweets, I conclude: Cybernats do exist, and in greater numbers than the equivalent unpleasant No-supporters (Unitrolls), but they were still a very small minority. There were also people who, though not abusive, were aggressive -- again relatively small in number and effect. Mostly people did, as Gordon Brown put it, "disagree without being disagreeable".

A larger effect was (I think) the self-reinforcing momentum that Yes built up, which closed down the space for No voters to speak. Although the Yes and No camps were roughly the same size, the Yes were more passionate. A spiral emerged, whereby Yes supporters received considerable community reinforcement online, hence building that community, whereas No posts received more disagreement and critical questioning, acting to deter them.

So it wasn't really the Cybernats or any form of foul play. Social media became welcoming to one set of people and off-putting to the other. The No voice was self-silenced by the dominance of the Yes voice.

What makes for better debates?

This brings out an inherent weakness in forum debates -- that if a position should become too dominant, that very fact acts to shut down debate, as the other parties quietly leave.

So how do we create spaces where debate can flourish? Comparing Facebook and Twitter, we see that privacy and community encourage people to speak their mind. Unsurprising, but worth learning from.

The problem with Facebook debates is that they easily become closed "bubbles", rather than open debates. Each bubble acts as its own echo-chamber, with the illusion that everyone agrees. In this case, the debate cut across networks, so most people knew Yes & No voters. Even so, Facebook's filtering algorithms quickly act to create bubbles-within-bubbles. Facebook tries to select things-you-will-like and filter out things-you-won't-like, which is great for seeing/avoiding cat photos, but it's not a formula for real debate. In other debates, you might never hear the other side's point of view.^[3]

Is there a third way -- something which can combine the open knowledge sharing of Wikipedia with the community aspect of Facebook?

I have some vague thoughts on how this might work (drawing on ideas from Lucas Dixon, Colin Fraser & Ben Young's old KenYersel project, http://www.kenyersel.org/), but nothing concrete. Any suggestions?

[1]: Counting #voteno / #voteyes from 1st August, 89.2% of vote-x tweets were for Yes. Even this is actually an under-estimate of the Yes dominance. The #voteno tweets were disproportionately from outside Scotland, and there were also considerable Yes supporters jumping on the #voteno hashtag to reach-out-to / mess-with the No audience. Anecdotally, the picture was more balanced on Facebook, where greater privacy and friendship groupings created a more open debating space.

[2]: Summary results from Lord Ashcroft's poll of 2,000 people

[3]: See Gilad Lotan's fascinating analysis of social media activity around the recent Israel/Gaza conflict: a data analysis of how news around the Israel-Gaza war was handled in social networks. TL;DR: People talk to like-minded folk who reinforce each other's viewpoint. Few people hear what the other side is saying.

Picture from http://www.anticapitalistes.net/

15 August 2014

Some useful Eclipse <-> IntelliJ key bindings

As a veteran Eclipse user working with IntelliJ, here are the key-bindings I'm finding invaluable.

A nice surprise: IntelliJ has a keymap setting which mimics much of Eclipse, so your fingers don't have to relearn everything:
In File->Settings->Keymap, choose the "Eclipse" key bindings

A few vital key combos are different though:

Auto-complete / fix
Eclipse: Control+Space
IntelliJ: Alt+Enter

Open resource / type, for quickly jumping to a file
Eclipse: Shift+Control+R or Shift+Control+T
IntelliJ: "double shift" (hit Shift twice)

12 August 2014

Documentation for the Educated Stranger

When you're documenting an object/method/variable, ask: What would a stranger -- who knows basic coding & a bit about our project -- need to know in order to:

Get or make an instance of this object?
Use a class or function?
Edit this code?

Good documentation is invaluable. It enables new people to join the team, and it takes some of the pain out of code maintenance. It's a balancing act: It should be short yet clear. And of course, you don't want to spend too long on it.

The clearer you can make the code itself, the less documentation is needed (see the notes below on avoiding pointless documentation).
But: readable method and variable names alone are not enough to make code self-documenting.

Things to document:

Assumptions a method makes about the state of things.
Exceptions that may be thrown, and why.
Parameters whose meanings aren't obvious from the names. But there is no need to document the obvious.
Methods that may return null, and why.
When parameters can be null, and what that means.
Object lifecycle (i.e. how it is wired up and disposed).
Typical calling patterns -- how methods fit together.

The SoDash/Winterwell House Style

Prefaces you might use:
- FIXME for we-really-should-do-this.
- TODO for less urgent tasks.
- ?? for questions / things you're uncertain about.
- NB for tangential notes, e.g. explaining why you didn't use alternative method X.
You can sign comments tweet-style with ^name or ^initials.
If you see code whose purpose or behaviour you can't understand, add a comment. E.g. "?? is this number a probability? ^DW April 2014"
Use assert in java, or SJTest's assert() and assertMatch() in javascript as a way to simultaneously test & document in-code assumptions.

How not to document

It's not about the amount of JavaDoc / JSDoc. It's the quality. Here's an example of pointless documentation:


/** {Object} */
var displayData;   

/** Get the display data for the part. 
* @param {string} url The url 
* @returns {Object} the display data.
*/
function getDisplayData(part, url)

How weak is this?

There's no point in doc which just restates the function/variable name. You are not writing for idiots.
Type {Object} tells us very little! What kind of object is it?
A common case is where Object is a lookup map, in which case say what the keys are. Consider using a name which clearly describes the lookup, e.g. tagsetFromTag is clear, whereas tagsets isn't.
Parameters: Can they be null? What happens if they are?
The input parameter url in this example is ambiguous: is this an absolute url, a relative url, or does it not matter because both will give the same output, or does it matter but which to use is specified elsewhere?
Return value: can it be null?
Lifecycle: when can this be called? E.g. does it only make sense after the page has done some initialisation steps?
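
For contrast, here is one way the same function might be documented so that it answers those questions. This is only a sketch: the original example does not say what the parameters or the return value really are, so the lookup map displayDataFromPartId, the part.id and title fields, and the null behaviour are all invented for illustration.

```javascript
/**
 * Lookup map: part id -> what to show for that part.
 * NB: hypothetical structure, invented for this sketch.
 * @type {Object.<string, {title: string, url: string}>}
 */
var displayDataFromPartId = {};

/**
 * Look up what to show for a part.
 * Lifecycle: only call this after displayDataFromPartId has been filled.
 * @param {{id: string}} part Must not be null.
 * @param {?string} url Absolute url of the page showing the part.
 *   If null, the url stored for the part is used instead.
 * @returns {?{title: string, url: string}} A fresh copy, or null if the
 *   part is unknown.
 * @throws {Error} if part is null or undefined.
 */
function getDisplayData(part, url) {
  if ( ! part) throw new Error("getDisplayData: part is required");
  // hasOwnProperty, so ids like "constructor" don't hit Object's own fields
  if ( ! Object.prototype.hasOwnProperty.call(displayDataFromPartId, part.id)) {
    return null;
  }
  var data = displayDataFromPartId[part.id];
  // return a copy, so callers can't modify the lookup map by accident
  return { title: data.title, url: url || data.url };
}
```

The comments are longer than the pointless version, but every line tells the educated stranger something the names alone do not: what the keys are, what null means, and when it is safe to call.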

Photo: Book sculpture by Daniel Lai, "Kenjio"

29 May 2014

Welcome to my blog-site Platypus Innovation.

The platypus caused consternation, shattered existing categories. Its existence was undeniable, but how should taxonomic theory be adapted to accommodate this uncomfortable fact?

This blog is also hard to classify. It loosely follows the interests and activities of Winterwell Associates, but it also includes personal material. Topics are likely to range from business affairs to new media via abstract mathematics.

One purpose of this site is to test out some of Winterwell's web technology. This site will sync with my SoDash account, which unlocks some rather unique features, though you're unlikely to encounter them during normal browsing.

Where will this blog go? From humble acorns, great oaks grow. But what if we've planted peanuts by mistake? Or genetically modified acorns that will turn into Evil Oaks? Only time will tell.

6 May 2014

Javascript/html templating: underscore.js for the win

Man cutting steel with a template (cc) Washington State Dept of Transport

After experimenting with Javascript templating libraries, we decided to use underscore.js. It scores over JQuery, Mustache and others in having a very clean and simple relationship between templating and normal code.

Underscore templates provide spaces where you can drop into javascript code - that's all they do, but it works better than trying to do more. Crucially, this gives you a lot of freedom - the full freedom of javascript.

For example, here is a simple loop using underscore:

<% for(var i=0; i<3; i++) print("<p>Line "+i+"</p>"); %>

That middle section isn't the prettiest, but it's something every web developer should already understand -- and know how to write.

Compare this with the approach taken in JQuery templates:

Line $i

{{/each}}

This looks a little bit nicer -- but what are these new magic {{}} tags? Plus we need to pass in the list `[{i:1},{i:2},{i:3}]` in order to use it. And we have less ability to build richer, more complex templates.

Hence, by taking a simpler code-based approach, underscore.js templates actually end up being more widely editable and more powerful.
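
To make the idea concrete, here is a toy engine in the same spirit. It is not underscore's implementation, and compileTemplate is a made-up name; it only shows how little machinery the drop-into-javascript approach needs. It handles <% ... %> code blocks and a print() helper, nothing else.

```javascript
/**
 * Toy template compiler (illustrative only, NOT underscore's code).
 * Text outside <% %> is output as-is; text inside is run as javascript.
 * @param {string} text The template source.
 * @returns {function(Object=): string} A render function.
 */
function compileTemplate(text) {
  // Build the source of a function which collects output in an array
  var body = "var out = [];\nfunction print(s) { out.push(s); }\n";
  var codeBlock = /<%([\s\S]*?)%>/g;
  var last = 0, m;
  while ((m = codeBlock.exec(text)) !== null) {
    // literal text before this code block
    body += "out.push(" + JSON.stringify(text.slice(last, m.index)) + ");\n";
    // the code block itself, dropped in unchanged
    body += m[1] + "\n";
    last = codeBlock.lastIndex;
  }
  // literal text after the final code block
  body += "out.push(" + JSON.stringify(text.slice(last)) + ");\n";
  body += "return out.join('');";
  return new Function("data", body);
}

var render = compileTemplate(
  '<ul><% for(var i=0; i<3; i++) print("<li>Line "+i+"</li>"); %></ul>');
console.log(render());
// <ul><li>Line 0</li><li>Line 1</li><li>Line 2</li></ul>
```

Because the template body is just javascript, anything you can write in a loop or an if-statement works in a template too, with no new tag language to learn.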

30 April 2014

Lessons for code documentation from product design?

Book sculpture by Daniel Lai, “Kenjio”

Good documentation is hard to do. It's a balancing act: It should be short yet clear. And of course, you don't want to spend too long on it.

I try to keep 2 high-level questions in mind, and write notes as I go which answer these:

1. What should a user of a system / class / method know?

This guides writing of javadoc and higher-level documents.

A "user" here is a developer who will call on the class/method, but won't look inside it. E.g. if you code in Java, then you're probably a user of java.util.List, and its (good quality) javadoc is aimed at you.

This user wants to understand: The purpose of a class/method, the inputs, the outputs/effects. Where it fits in the application's life-cycle.

2. What should a future developer who will edit the code of that system / class / method know?

This second "user" wants a few more details: Design choices & reasons, a summary for complex bits, any dead-ends to avoid. This last -- making notes on dead-ends and misadventures -- is especially useful when you return to old code & wonder why-did-I-do-it-that-way?

At each level, think about the person reading your documentation -- How have they got here? That is, what do they already know, and what might be strange to them? What are they trying to do? What do they need to know to do that (safely & well)?

Essentially, let's take the ethos of user-centred design -- and apply it to documenting code, by thinking of the developer as a form of user, and documentation as a product.

17 February 2014

There are no AAA databases

It's a mistake to believe absolutely in uncertain things. That's one of the lessons of the financial crisis. Uncertain loans were dressed up as triple-A reliable assets, but it turned out to be wishful thinking.

Dice bag (cc) KaptainKobold@Flickr

I see similar practices in databases and business intelligence.

We all know that databases contain errors. The errors come from many sources: data is mis-entered, or it was accurate but people move on, or the database schema was changed, but not all the data was correctly updated, or two databases are merged, but the join is dodgy: same name doesn't always mean same person. I've yet to encounter a database that didn't contain errors.

Everyone knows this. And yet people build business processes that assume the database is 100% correct. Even best practice in data analysis is only to try and limit errors entering the system -- but once they're in, the mistakes can run free.

In business intelligence, we see claims that everything can be measured. Claims that are plausible & we'd like to believe. All too often it's over-confidence and over-selling.

Accepting uncertainty does not mean giving up on measurement. It just means accepting errors are part of measurement. Once we accept that, we can deal with it. We should estimate the things we cannot directly & accurately measure. But remember that is an estimate. And know how good that estimate is, and how much that affects your decisions. There are cases where the-right-order-of-magnitude is fine, and others where even 99% accuracy isn't good enough.
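
A rough back-of-envelope sketch shows why 99% can fall short. It assumes each record is correct independently with the same probability, which real databases won't exactly satisfy, and the function names and figures are made up for illustration.

```javascript
/**
 * Chance that a result built from n records contains no errors at all,
 * ASSUMING each record is independently correct with probability `accuracy`.
 */
function chanceAllCorrect(accuracy, n) {
  return Math.pow(accuracy, n);
}

/** Expected number of bad records in a table of the given size. */
function expectedErrors(accuracy, tableSize) {
  return (1 - accuracy) * tableSize;
}

// A report which draws on 100 records, each 99% accurate:
console.log(chanceAllCorrect(0.99, 100).toFixed(3)); // 0.366
// A million-row table at 99% accuracy:
console.log(Math.round(expectedErrors(0.99, 1000000))); // 10000
```

So under these assumptions, a report which touches 100 records is more likely than not to contain at least one error. Whether that matters depends on the decision being made.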

It's especially important to know the blind-spots in your KPIs -- the things you can't properly measure. And there are always blind spots.

Anyone who promotes KPIs and ROIs without talking about errors is selling something unreliable. It's easy enough to hide uncertainty & inaccuracy - but you pay the cost down the line with interest. Remember the AAA sub-prime loans -- not all that glitters is gold. Ignore uncertainty at your peril.

The salesmen of over-confidence cannot have it both ways: if data is important, you'd better be honest about its quality.

10 February 2014

Geocoding Twitter: Who cares about New Zealand?

Geo-coding is where you take descriptions of a place -- such as the location people give out on Twitter -- and work out where on Earth it actually is.

Geo-coding is not an exact science. E.g. "Cambridge" could refer to a city in the UK, or one next to Boston in the USA (and oddly, both cities are home to world-class universities). And that's the easy stuff. Twitter locations can be... interesting -- such as "wherever there is dancing", or "city of purple".

So geocoding software can be forgiven for making occasional mistakes & odd choices. Here are some we've found:

Heaven is in Iran, but Paradise is in the USA.
Iran also counts as far far away.
Reality is in India
Gun Shaped State is Oman, as is Somewhere Yu Aint! (I suppose there's a kind of logic here-- for most of us, Oman is somewhere we aren't).
Wonderful Island is Taiwan...
...but Whore Island is somewhat cruelly identified as Iceland
Atop of a Whovian Bum is in Azerbaijan

My favourite malapropism:
Who cares? and Who knows? mean you're in New Zealand

NB: We currently use a mix of Google, Yahoo & Twitter geocoders (each of which has its own strengths and weaknesses). The examples above come from one of those three. For the random ones it's usually Google, who have the largest, most varied coverage. We are developing our own in-house geocoder based on Open Street Map data.

Spam, spam, lovely spam

This comment was such a great piece of spam, we had to publish it (minus the url).

Do you have а spam isѕue оn thіs ѕite;
I also аm a blogger, аnd Ι wаs wanting to knoω your ѕіtuation;
many of us hаve developeԁ some nicе mеthods
and wе агe looking to trade strаtegіes
ωith other folks, bе suгe to shoot me an emаil
if іntereѕtеd.

Нere iѕ my homеpagе viagra

Javascript Enums

Enums are a useful way to handle a set of constants. They protect against typos and bad-values, help to spot missed cases, and make refactoring a lot safer.

Javascript does not have enums. So how can we get the same benefits?

Enum.js is a simple class which gives you enum-like behaviour.

Example: Instead of writing e.g. if (sibling == 'BROTHER') ... throughout your code, write Sibling = new Enum('BROTHER SISTER'); then use if (Sibling.isBROTHER(x)) ....

Here's the Enum.js code as a gist
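
For a flavour of how this can work, here is a minimal sketch. It is not the real Enum.js code (see the gist for that): only the new Enum('BROTHER SISTER') constructor and the isBROTHER(x) style of test come from the example above. The values list, the assert method, and the choice to throw on unknown values are assumptions made for this sketch.

```javascript
/**
 * Minimal enum sketch (NOT the real Enum.js, whose API may differ).
 * @param {string} names Space-separated constant names, e.g. 'BROTHER SISTER'.
 */
function Enum(names) {
  var self = this;
  /** All the constants, in order. */
  self.values = names.split(/\s+/).filter(function(v) { return v; });
  self.values.forEach(function(v) {
    self[v] = v;                     // e.g. Sibling.BROTHER === 'BROTHER'
    self['is' + v] = function(x) {   // e.g. Sibling.isBROTHER(x)
      self.assert(x);
      return x === v;
    };
  });
}

/**
 * Throw on anything that is not one of the constants, so typos and
 * bad values fail fast instead of silently comparing as false.
 */
Enum.prototype.assert = function(x) {
  if (this.values.indexOf(x) === -1) {
    throw new Error("Not in enum [" + this.values.join(" ") + "]: " + x);
  }
  return x;
};

var Sibling = new Enum('BROTHER SISTER');
console.log(Sibling.isBROTHER(Sibling.BROTHER)); // true
console.log(Sibling.isBROTHER('SISTER'));        // false
```

With this sketch, a typo such as Sibling.isBROTHER('BROTHR') throws straight away, which is the protection against bad values described above.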
