NHS Hackday 2015

This weekend I took part in an incredibly successful NHS hackday, hosted at Cardiff University and organised by Anne Marie Cunningham and James Morgan. We went as a team from the MSc in Computational Journalism, with myself and Glyn attending along with Pooja, Nikita, Annalisa and Charles. At the last minute I recruited a couple of ringers as well, dragging along Rhys Priestland Dr William Wilberforce Webberley from Comsc and Dr Matthew Williams, previously of this parish. Annalisa also brought along Dan Hewitt, so in total we had a large and diverse team.

The hackday

This was the first NHS hackday I’d attended, but I believe it’s the second event held in Cardiff, so Anne Marie and the team have it down to a fine art. The whole weekend seemed to go pretty smoothly (barring a couple of misunderstandings on our part regarding the pitch sessions!). It was certainly one of the most well organised events that I’ve attended, with all the necessary ingredients for successful coding: much power, many wifi and plenty of food, snacks and coffee. Anne Marie and the team deserve much recognition and thanks for their hard work. I’m definitely in for next year.

The quality of the projects created at the hackday was incredibly high across the board, which was great to see. One of my favourites used an Oculus Rift virtual reality headset to create a zombie ‘game’ that could be used to test people’s peripheral vision. Another standout was a system for logging and visualising the ANGEL factors describing a patient’s health situation. It was really pleasing to see these rank highly with the judges too, coming in third and second in the overall rankings. Other great projects brought an old Open Source project back to life, created a system for managing groups walking the Wales Coast path, and created automatic notification systems for healthcare processes. Overall it was a really interesting mix of projects, many of which have clear potential to become useful products within or alongside the NHS. As Matt commented in the pub afterwards, it’s probably the first hackday we’ve been to where several of the projects have clear original IP with commercial potential.

Our project

We had decided before the event that we wanted to build some visualisations of health data across Wales, something like nhsmaps.co.uk, but working with local health boards and local authorities in Wales. We split into two teams for the implementation: ‘the data team’ who were responsible for sourcing, processing and inputting data, and the ‘interface team’ who built the front-end and the visualisations.

Progress was good, with Matthew and William quickly defining a schema for describing data so that the data team could add multiple data sets and have the front-end automatically pick them up and be able to visualise them. The CompJ students worked to find and extract data, adding them to the github repository with the correct metadata. Meanwhile, I pulled a bunch of D3 code together for some simple visualisations.

By the end of the weekend we had established a fairly decent system. It can visualise a few different types of data at different resolutions, is mostly mobile friendly, and most importantly is easily extensible and adaptable. It’s online now on our github pages, and all the code and documentation is in the github repository.

We’ll continue development for a while to improve the usability and code quality, and hopefully we’ll find a community willing to take the code base on and keep improving what could be a fairly useful resource for understanding the health of Wales.

Debrief

We didn’t win any of the prizes, which is understandable. Our project was really focused on the public understanding of the NHS and health, rather than on solving a particular need within (or for users of) the NHS. We knew this going into the weekend, and we’d taken the decision that it was more important to work on a project related to the course, so that the students could experience some of the tools and technologies they’ll be using as the course progresses, than to do something more closely aligned with the brief that would perhaps have been less relevant to the students’ work.

I need to thank Will and Matt for coming and helping the team. Without Matt wrangling the data team and showing them how to create JSON metadata descriptors we probably wouldn’t have had anywhere near as many example datasets as we do. Similarly, without Will’s hard work on the front-end interface, the project wouldn’t look nearly as good as it does, or have anywhere near the functionality. His last-minute addition of localStorage for personal datasets was a triumph. (Sadly though he does lose some coder points for user agent sniffing to decide whether to show a mobile interface :-D.) They were both a massive help, and we couldn’t have done it without them.

Also, of course, I need to congratulate the CompJ students, who gave up their weekend to trawl through datasets, pull figures off websites and out of PDFs, and create the lovely, easy-to-process .csv files we needed. It was a great effort from them, and I’m looking forward to our next Team CompJ hackday outing.

One thing that sadly did stand out was a lack of participation from Comsc undergraduate students, with only one or two attending. Rob Davies stopped by on Saturday, and both Will and I discussed with him what we can do to increase participation in these events. Hopefully we’ll make some progress on that front in time for the next hackday.

Media

There are some great photos from the event on Flickr, courtesy of Paul Clarke (Saturday and Sunday). I’ve pulled out some of the best of Team CompJ and added them here. All photos are released under a Creative Commons BY-NC 2.0 licence.

 

Elsewhere…

We got a lovely write-up about our project from Dyfrig Williams of the Good Practice Exchange at the Wales Audit Office. Dyfrig also curated a great Storify of the weekend.

Hemavault labs have done a round-up of the projects here.

Quick and Dirty Twitter API in Python

QUICK DISCLAIMER: this is a quick and dirty solution to a problem, so may not represent best coding practice, and has absolutely no error checking or handling. Use with caution…

A recent project required me to scrape some data from Twitter. I considered using Tweepy, but as it was a project for the MSc in Computational Journalism, I thought it would be more interesting to write our own simple Twitter API wrapper in Python.

The code presented here will allow you to make any API request to Twitter that uses a GET request, so it is really only useful for getting data from Twitter, not sending it to Twitter. It is also only for use with the REST API, not the streaming API, so if you’re looking for real-time monitoring, this is not the API wrapper you’re looking for. This API wrapper also uses a single user’s authentication (yours), so it is not set up to allow other users to use Twitter through your application.

The first step is to get some access credentials from Twitter. Head over to https://apps.twitter.com/ and register a new application. Once the application is created, you’ll be able to access its details. Under ‘Keys and Access Tokens’ are four values we’re going to need for the API – the Consumer Key and Consumer Secret, and the Access Token and Access Token Secret. Copy all four values into a new python file, and save it as ‘_credentials.py’. The images below walk through the process. Also – don’t try and use the credentials from these images, this app has already been deleted so they won’t work!
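
For reference, and going by the variable names the code below expects, ‘_credentials.py’ just needs to hold the four values as plain Python strings, something like:

# _credentials.py
# values copied from the 'Keys and Access Tokens' page of your app
client_id = "YOUR_CONSUMER_KEY"
client_secret = "YOUR_CONSUMER_SECRET"
access_token = "YOUR_ACCESS_TOKEN"
access_secret = "YOUR_ACCESS_TOKEN_SECRET"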

Once we have the credentials, we can write some code to make some API requests!
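
One bit of housekeeping first: for the snippets below to run as a single file they need a handful of standard library imports, plus the credentials we just saved. Assuming ‘_credentials.py’ defines the variable names used later on, something like this at the top of the file will do:

import base64
import hmac
import json
import threading
import time
import urllib.error
import urllib.parse
import urllib.request
import uuid

from hashlib import sha1

from _credentials import (client_id, client_secret,
                          access_token, access_secret)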

First, we define a Twitter API object that will carry out our API requests. We need to store the API URL, and some details that allow us to throttle our requests to Twitter to fit inside their rate limits.

class Twitter_API:

 def __init__(self):

   # URL for accessing API
   scheme = "https://"
   api_url = "api.twitter.com"
   version = "1.1"

   self.api_base = scheme + api_url + "/" + version

   #
   # seconds between queries to each endpoint
   # queries in this project limited to 180 per 15 minutes,
   # so we aim slightly under that (175) to leave some headroom
   query_interval = float(15 * 60)/(175)

   #
   # rate limiting timer
   self.__monitor = {'wait':query_interval,
     'earliest':None,
     'timer':None}

We add a rate limiting method that will make our API sleep if we are requesting things from Twitter too fast:

 #
 # rate_controller puts the thread to sleep 
 # if we're hitting the API too fast
 def __rate_controller(self, monitor_dict):

   #
   # join the timer thread
   if monitor_dict['timer'] is not None:
     monitor_dict['timer'].join()

   # sleep if necessary (not needed on the very first call,
   # when 'earliest' is still None)
   while monitor_dict['earliest'] is not None and time.time() < monitor_dict['earliest']:
     time.sleep(monitor_dict['earliest'] - time.time())

   # work out when the next API call can be made
   earliest = time.time() + monitor_dict['wait']
   timer = threading.Timer(earliest - time.time(), lambda: None)
   monitor_dict['earliest'] = earliest
   monitor_dict['timer'] = timer
   monitor_dict['timer'].start()

The Twitter API requires us to supply authentication headers in the request. One of these headers is a signature, created by encoding details of the request. We can write a function that will take in all the details of the request (method, url, parameters) and create the signature:

 # 
 # make the signature for the API request
 def get_signature(self, method, url, params):
 
   # escape special characters in all parameter keys
   encoded_params = {}
   for k, v in params.items():
     encoded_k = urllib.parse.quote_plus(str(k))
     encoded_v = urllib.parse.quote_plus(str(v))
     encoded_params[encoded_k] = encoded_v 

   # sort the parameters alphabetically by key
   sorted_keys = sorted(encoded_params.keys())

   # create a string from the parameters
   signing_string = ""

   count = 0
   for key in sorted_keys:
     signing_string += key
     signing_string += "="
     signing_string += encoded_params[key]
     count += 1
     if count < len(sorted_keys):
       signing_string += "&"

   # construct the base string
   base_string = method.upper()
   base_string += "&"
   base_string += urllib.parse.quote_plus(url)
   base_string += "&"
   base_string += urllib.parse.quote_plus(signing_string)

   # construct the key
   signing_key = urllib.parse.quote_plus(client_secret) + "&" + urllib.parse.quote_plus(access_secret)

   # sign the base string with the key using HMAC-SHA1, and base64 encode the result
   hashed = hmac.new(signing_key.encode(), base_string.encode(), sha1)
   signature = base64.b64encode(hashed.digest())
   return signature.decode("utf-8")

Finally, we can write a method to actually make the API request:

 def query_get(self, endpoint, aspect, get_params={}):
 
   #
   # rate limiting
   self.__rate_controller(self.__monitor)

   # ensure we're dealing with strings as parameters
   str_param_data = {}
   for k, v in get_params.items():
     str_param_data[str(k)] = str(v)

   # construct the query url
   url = self.api_base + "/" + endpoint + "/" + aspect + ".json"
 
   # add the header parameters for authorisation
   header_parameters = {
     "oauth_consumer_key": client_id,
     "oauth_nonce": uuid.uuid4(),
     "oauth_signature_method": "HMAC-SHA1",
     "oauth_timestamp": time.time(),
     "oauth_token": access_token,
     "oauth_version": 1.0
   }

   # collect all the parameters together for creating the signature
   signing_parameters = {}
   for k, v in header_parameters.items():
     signing_parameters[k] = v
   for k, v in str_param_data.items():
     signing_parameters[k] = v

   # create the signature and add it to the header parameters
   header_parameters["oauth_signature"] = self.get_signature("GET", url, signing_parameters)

   # add the OAuth headers
   header_string = "OAuth "
   count = 0
   for k, v in header_parameters.items():
     header_string += urllib.parse.quote_plus(str(k))
     header_string += "=\""
     header_string += urllib.parse.quote_plus(str(v))
     header_string += "\""
     count += 1
     if count < len(header_parameters):
       header_string += ", "

   headers = {
     "Authorization": header_string
   }

   # create the full url including parameters
   url = url + "?" + urllib.parse.urlencode(str_param_data)
   request = urllib.request.Request(url, headers=headers)

   # make the API request
   try:
     response = urllib.request.urlopen(request)
   except urllib.error.HTTPError as e:
     print(e)
     raise e
   except urllib.error.URLError as e:
     print(e)
     raise e

   # read the response and return the json
   raw_data = response.read().decode("utf-8")
   return json.loads(raw_data)

Putting this all together, we have a simple Python class that acts as an API wrapper for GET requests to the Twitter REST API, including the signing and authentication of those requests. Using it is as simple as:

ta = Twitter_API()

# retrieve tweets for a user
params = {
   "screen_name": "martinjc",
}

user_tweets = ta.query_get("statuses", "user_timeline", params)
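
The call returns the decoded JSON response from Twitter, so for the user_timeline endpoint user_tweets is just a list of tweet dictionaries that can be worked with directly, for example:

# print the text of each tweet returned
for tweet in user_tweets:
    print(tweet["text"])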

As always, the full code is online on Github, in both my personal account and the account for the MSc Computational Journalism.

BoxUK “For the Social Good” Hackday

Yesterday I attended the “For the Social Good” hackday, organised by BoxUK and held at the Students Union at Cardiff University. As you may have gathered from the title, it was a hackday themed around creating apps that had some benefit to society. The event ran from 10am to 10pm, so it was a fairly short hack event compared to some, which had a big influence on what could be done during the time available. In total, given the time needed for introductions and getting started in the morning, and then for presentations and judging at the end of the day, there was only about nine and a half hours of coding time, so it was quite a tough day. I went along with Matt Williams and Mark Greenwood as our team, and over the course of the day we managed to put together a fairly functional web app.

We took what we learnt from hosting and participating in the Foursquare hackathon and used that to guide what we did at this hackday. At the last event we had been slow to get going, took too long to form a team and decide upon an idea. We had aimed too high to begin with, and then also let feature creep affect what we wanted to build. Finally, our teams were far too big, meaning it was hard to divide labour up so that everyone was participating and had something to do. We finished with a working app at the end of the event, but we were determined not to make the same mistakes this time so we could get a more successful outcome. (Not that FourCrawl wasn’t a successful outcome of course, but more on that in the next couple of weeks.)

Going into this hackday we solved the team size problem naturally, as there were only three of us able to attend, which made forming a small team compulsory! To try and get a good start on the day of the event we met on the Wednesday before for a think about ideas and a bit of a brainstorming session. While we had a lot of ideas, most of them were too complex for a short, day-long hack event, so unfortunately we didn’t really come up with anything. We did have a vague idea of doing something related to cold weather though, to make whatever ‘it’ was topical, given that we’re heading into winter now. Fortunately at least one of us (not me) had obviously been thinking about it further after the meeting, and on Saturday evening Matt sent round a message on G+ saying he’d found a good data set we could use for an app that would be both easy to code within the time limit and would hit the brief for the hackday pretty well. He’d found a dataset on data.gov.uk related to road accidents that included information on weather and road surface conditions, which we could use to show people where to avoid in icy weather, or to give them an idea of previous incidents in their area. With an idea starting to form we went into Sunday in a pretty good state, ready to start coding very quickly.

The day started well: everyone arrived pretty much on time, which made for a quick start. Our app idea solidified easily, and early on we realised that while we had historical data to display, it would also be nice to have some current updates. So we added a social element to the idea, scraping Twitter for a certain hashtag so that people could report problems with icy roads in their area, along the lines of the ‘#uksnow’ hashtag that has been used frequently over the last couple of winters. We were able to jump in quickly and start getting things built. Matt was in charge of getting the data into a database and building an API to retrieve it, using django to get a backend up and running quickly. I was in charge of building a front-end map to display the data on, primarily working with HTML and javascript (and the Google Maps API obviously), and Mark was in charge of getting the Twitter scraping and postcode geo-locating working using python (so it could be slotted into the django backend), accessing the Twitter API through tweepy.

Things went really well with the implementation. I spent a while cannibalising elements from previous sites to get a frontend up and running quickly (and also, thank you bootstrap!), and by lunch we were pretty close to starting to consolidate things together. We set a target of starting to deploy by 6, so that by the time dinner was served Matt was starting to battle his server and its stubborn refusal to allow easy deployment of django apps. Things went so smoothly in fact that we were all finished and ready with almost an hour to spare, meaning we were the only team able to slope off for a self-congratulatory pint before the presentations and judging. We headed to the Taf (a place I hadn’t been in at least five years) for a swift one, then went back to the hackday for the final presentations.

Our presentation was first, and Matt demoed the app well while I played around on the laptop next to him. After us the rest of the teams went, and there were some really good apps shown off. My particular favourite was the ‘occupy where’ app, which allowed you to search for an ‘occupy’ camp or demo near a location, then presented information pulled in from multiple sources about that particular ‘occupy’ movement. A nice idea, well executed. Following the presentations the judges spent some time talking, then announced the winners. The surprise of the day was that they’d decided our app was the best and that we’d won! I was pretty shocked, but it was a very nice surprise. A group of first years from COMSC were runners-up, and a third year COMSC student (Tom Hancocks) won the best individual developer prize, so COMSC actually walked away with all the prizes! After a quick tear-down of the venue it was time to head to the Taf for some celebratory drinks, a satisfying end to a really great day.

UPDATE 25/11/11:

There are some excellent write-ups by various other people involved in the event below. I’ll add to this list as and when I find more:

http://www.boxuk.com/news/box-uks-public-hacking-event-a-resounding-success

http://johngreenaway.co.uk/hack-day-for-the-social-good

http://marvelley.com/2011/11/24/for-the-social-good-box-uk-hackday-1-recap/

http://whitehandkerchief.co.uk/blog/?p=53

http://www.cs.cf.ac.uk/newsandevents/hackday.html

http://www.cardiff.ac.uk/news/articles/diwrnod-hacio-7933.html

http://www.boxuk.com/blog/the-box-uk-hack-day-a-review 

Foursquare Hackathon Post Mortem

Note: I wrote this post just after the hackathon, but then forgot to actually publish it! 

There’s a fair amount I learnt from taking part in and helping to organise the Foursquare hackathon here at the university that probably deserves its own post. I’ll split it into two parts: the organisation and the taking part.

Organising a hack event

There were a couple of things I learnt from helping to organise the event that perhaps weren’t clear to begin with:

  1. Expect dropouts. A lot of them. We ended up with about 50% of the people who said they would attend actually attending. This was fine because we hadn’t planned on providing catering etc. anyway, but if we had, we might have overspent massively.
  2. Get someone in to run the day who doesn’t actually want to code. All three of us organising the event really wanted to get involved and create stuff, so there was no-one left to organise the other essential things like food. Too often people would talk about ordering food, or going out for dinner, and then get sucked back into programming and it would get forgotten about. By the end of the weekend we’d all eaten really badly, often very late at night. Having someone there to do things like order pizza or make coffee runs would probably have helped things go a lot more smoothly.
  3. If you’re going to be communicating with the world at large using twitter or something similar, make sure you communicate with each other within the organising team about posting updates. Often two people would try and reply to a query at the same time, which made us look a bit silly. Not a major issue, but it might help give a more professional feel, even if you’re total amateurs.

Participating in a hack event

We also learnt a lot about participating in a hack event. My top tips for a successful project would be:

  1. Do your research. If you can, go into the event with an idea already, just in case no-one else has any. It’ll help you get started quicker.
  2. Keep it small. Don’t overreach. 24/48 hours is not a long time to code something, and by making it too complicated you’ll be disappointed with the outcome. Simple is best. Add extra features later if you have time.
  3. Small, agile teams. These projects have to be small because of the time constraints, so if teams get too big there won’t be enough for everyone to do. People will end up feeling useless or left out, which is never good. I would say a maximum of 4 people per team.
  4. If you want to learn something new, a hack event can be a great place to be forced to up your skills in a particular area very quickly. It can also be a great place to learn new skills and tools from others. However, this may lead to the end result not being ideal. If you can live with that, great, you’ll walk away happy. I personally upped my javascript knowledge over the weekend from approximately 0 to something approaching a passable working knowledge, which was great.
  5. Know your tools. This is pretty much common sense, but if you’re not after learning something new and you just want a successful app/outcome, pick and use the tools you know really well. We went with django on the back end because myself and Matt know it pretty well, and we were able to get that part of the site running really really quickly. Had we gone with something else it may not have worked so well.
  6. Get coding. Screw the design, you can worry about that later. Start coding early, and code fast.
So those are the things I learnt from the weekend. Hopefully we’ll be able to use this knowledge again, both organising and participating in future events.

Scraping data from MyFitnessPal with Python

Following my success with extracting my Google Search History in a simple manner, I’ve decided that I should do something similar to extract all the data I’ve been feeding into myfitnesspal for the last 5 months. As I briefly mentioned in the review of the app + website, the progress graphs leave a lot to be desired and there’s very little in the way of analysis of the data. I have a lot of questions about my progress and there is no easy way to answer all of them using just the website. For instance, what is my average sugar intake? Is this more or less than my target intake? How does my weekend nutrition compare to my weekday nutrition? How much beer have I drunk since starting to log all my food?

Unfortunately there isn’t an API for the website yet, so I’m going to need to resort to screen scraping to extract it all. This should be pretty easy using the BeautifulSoup python library, but first I need to get access to the data. My food diary isn’t public, so I need to be logged in to get access to it. This means my scraping script needs to pretend to be a web browser and log me in to the website in order to access the pages I need.

I initially toyed with the idea of reading cookies from the web browser’s sqlite cookie database, but this is overly complex. It’s actually much easier just to use python to do the login as a POST request and to store any cookies received back from that. Fortunately I’m not the first person to try and do this, so there are plenty of examples on StackOverflow of how to do it. I’ll post my own solution once it’s done.
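
As a rough illustration of the general idea (not the final solution), one straightforward way to do this is with the requests library: a Session object holds on to any cookies returned by the login POST and sends them with later requests, and BeautifulSoup can then parse the pages. The login URL and form field names below are illustrative guesses rather than the real MyFitnessPal ones, so they’d need checking against the site’s actual login form.

import requests
from bs4 import BeautifulSoup

# NOTE: the URL and form field names here are placeholders -
# inspect the real login form to find the correct values
LOGIN_URL = "https://www.myfitnesspal.com/account/login"
DIARY_URL = "https://www.myfitnesspal.com/food/diary"

session = requests.Session()

# POST the login form; the session holds on to any cookies sent back
session.post(LOGIN_URL, data={
    "username": "my_username",
    "password": "my_password",
})

# later requests are sent with those cookies, so private pages are accessible
response = session.get(DIARY_URL)

# hand the HTML over to BeautifulSoup for the actual scraping
soup = BeautifulSoup(response.text, "html.parser")
print(soup.title.string)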