I’ve never been much of a charity person. I’ve always found it hard to donate to a cause, but recently I found a cause that worked. For me, it wasn’t an international crisis; it was a local problem, and it wasn’t a donation of money.

My children’s pre-school had a website, and five years on the person who made it was looking to move on from the responsibility. The pre-school wanted to make some changes to the info on the pages and didn’t know how. When I took a look at it, it was obvious why. The site was hand-coded HTML, and every time a phone number or anything needed changing they had to call the IT person. And that person allegedly ran a website company. Shocker.

Well that just was not going to continue. Not on my watch.

So that was where I made my charity donation. It was all about using what I knew effectively. For me, setting up hosting and installing the mighty WordPress took a tiny amount of effort. It was effortless to immediately do a risk analysis of security and put in change control practices, to start doing backups and to choose a software package that permanently solved their problem, and solved it so well that it transformed what they could do.

However, I had to exert myself to not do too much. Interesting how keenly you feel the support work when you have to do it at 1am and that isn’t even your real job. But. Also, how sweet the saccharine hit of solving a problem that is totally out of reach, and solving it for an organisation that is caring for your child. They have no time to sit down and solve an IT problem and every pound they spend on a support contract with an IT firm is a pound they don’t have to spend on another teaching assistant.

Of course, it spiralled a bit. Soon they had some email problems and laptop problems, and needed to change the domain name, and they kept me on the hook. Occasionally it felt too much, but that’s their job: squeezing the lemon. And they had come to rely on me for a few things. But then, out of the blue, someone on the governing committee just said “I’m going into bat for Team Sam, I’m getting someone else to do it.” And they did.

I got to experience the final stage of the charity project: the hand off to the next volunteer. I even wrote the documentation! It wasn’t The Cathedral and the Bazaar (point 5), but I have to say I felt a tiny bit of pride that I’d applied global tech firm, auditable standards to this tiny, tiny IT problem and I could just hand it on knowing that things were in great shape.

For me, donating to people who needed help was great, but I also kept myself sharp. I didn’t use my normal technologies, it was a different problem and had to be running without support for years as “good enough” rather than our own multi-million pound efforts to get to 1000% note-perfect auditable polish. That was a great experience. I may have emitted the occasional curse while on the journey of course…

(this is an older post recycled from 2010)


Changing fonts in R

I’m trying to make R charts that meet my company’s brand guidelines.

To get different fonts into R, you need to load them separately, as it can’t just see the Windows fonts. But I suffered so you don’t have to.

I had a couple of options for the package that I used, and I used this trick to compare the packages. There’s a useful package called cranlogs that allows you to see how much they get downloaded, which is an indicator of their support level and freshness! Here I compare the showtext and extrafont packages.

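library(dplyr)    # provides as_tibble() and %>%
library(ggplot2)

cranlogs::cran_downloads(when = "last-month",
                         packages = c("showtext", "extrafont")) %>%
  as_tibble() %>%
  ggplot(aes(x = date, y = count, color = package)) +
  geom_line()  # assumed geom: the original snippet stopped at the trailing "+"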

I got more help on extrafont here.

Next: onto the colour wheel and loading the brand colours into the colour palette!






library(extrafont)

# have to run the import once only per system
# takes several minutes to import them all
# after that it loads all the fonts when you
# load the package
# if you uninstall the package, you have to do
# it again...
if (!"Gotham Office" %in% fonts()) {
  print("You haven't imported the fonts into R")
  print("This will take a few minutes")
  print("You only need run this step once!")
  font_import()  # assumed: the slow import the comments describe was missing here
}

# register the imported fonts with the Windows graphics device
loadfonts(device = "win")

library(ggplot2)

cheeseOrFont <- function(font_family) {
  ggplot(mtcars, aes(x = wt, y = mpg)) +
    geom_point() +
    xlab("Weight (x1000 lb)") +
    ylab("Miles per Gallon") +
    # a second ggtitle() replaces the first, so only this title is kept
    ggtitle("NOW YOU CAN SEE that the font is changed") +
    theme(text = element_text(size = 16, family = font_family))
}

cheeseOrFont("Gotham Office")
cheeseOrFont("Comic Sans MS")

How fast can you form a team?

When we look for excellence, looking at extreme examples can help to change expectations. When building a team, we expect it to move through stages until the team becomes “performing”, and we expect that to take time: many weeks to months.

What about forming a team in an hour? Can that be done? A team that collaborates, with individual contributors making their voice heard, each person bringing their own skills, each person working outside of their comfort zone? Is that possible in an hour?

My experience of jury service showed me that it is possible, but: can we replicate that in our own teams? What are the causes? What are the limitations?

Juries in the UK are drawn from the adult population; anyone on the voting register can be selected. That means that the jury is diverse. In our case the ages ran from a couple of people in their 20s up to people in their late 60s. We had men and women; professionals and non-professionals. Retired accountants who were school governors, carpenters who were part time firefighters, grandmothers who stack shelves in supermarkets, teachers, project managers, sick and healthy, old and young. So you could argue we had no useful skills; but really on a jury the only expertise is life. The more diverse your life experience, the better you are at seeing into the motivations and excuses that are the bread and butter of a court case. You might argue that experts in justice would be better, but the fresh approach that a new jury brings means that trials are heard fairly; if you spent all your time in court cases where half the people were lying, you would quickly slip into cynicism. Also, evidence in court cases is rarely complete; you have points of light from physical evidence but even that can be interpreted. The more diverse a team is, the more angles they can examine from. Diversity is strength.

Key point: Juries are fresh and enthusiastic. Juries are made of diverse individuals.


One thing that helps juries bond is the seriousness of the situation. From the moment you step into the court you are under no illusion: these people are not messing around. The silence, formal etiquette and formal dress code all reinforce that. The judge leads the court, both the jury and barristers. The instructions given to jurors are unflinching in their seriousness: that we are expected and required by law to make a good job of this. We take no breaks except when given leave by the judge. There are legal penalties for failing to attend. However, the judge also gives relief to juries; they need not know the law, they need not concern themselves with anything but the evidence; and in the evidence they have free rein to interpret it any way they like. Even if the barristers or the judge give advice, in matters of evidence, the jury may ignore it. You’d think that there would be little room for interpretation of evidence and the law, but no case gets to court unless it is complicated and subtle. If there was clear physical evidence then the case would not be heard at trial. For a jury, there are no facts, only interpretations.

However, even though juries need not know the law, the evidence is a heavy burden. Outside we might joke about how “they are all guilty”, but in the court you never joke like that. There is no doubt there is a moral duty, to the victim and the accused. We are asked to aspire to the highest standards; way above what we follow in our day to day lives.

Key point: the judge gives clear and unambiguous executive leadership; setting standards and burdens way above our expectations, then asking that we rise to them.


But how do juries form into a team? Especially one that has no time to form, and little common ground outside the court setting. We aren’t allowed to go out together and discuss the case (in fact we aren’t allowed to discuss the case at all until evidence is finished). Then we aren’t allowed to do anything else other than discuss the case. We can make only small talk until the discussions start, then we have to go into a room and only come out to take bathroom breaks and to go home if there is no decision. We all had to go for cigarettes together, even those who didn’t smoke. There can be no discussion that is not heard by all, and no discussion that can be heard by anyone who isn’t on the jury.

In my experience – and I can’t say too much about what we discussed – the high standards we were set encouraged us all to respect each other, and allow everyone time to speak. We quickly created a minimal process for round table discussion and everyone had something to say. In juries, all are created equal; education, race, age, creed all are relevant but none grants a superior point of view. All are equal, everyone has the right to speak, none has the right to compel anyone. As discussion progressed, some people brought their talents to bear: technical skill helping others examine CCTV evidence, patience in balancing argument, being willing to lead a discussion... or stop one. Creating mechanisms to progress discussions and drive consensus.

Key point: clear rules of equality are established, then let the team run itself.


The output of a jury is clear, and is commanded clearly by the judge: first prize is a unanimous decision. If agreement can’t be reached, then a majority decision of 10:2 is sufficient. We all know what we are trying to achieve and all of us have an equal stake in that. For my own part, we worked together bringing our own viewpoints but my verdict was my own. It was painful, but reaching my own decision was accompanied by a sense of peace and achievement that contrasted sharply with the uncertainty and weight of the process.

Yes, of course there was disagreement. Probably the worst of which was – I’m not proud to say – caused by me. What’s the only thing you can do wrong on a team trying to achieve a consensus? You can try and block or influence someone else with the force of your personality rather than the quality of your argument. I overstepped the line and was put in my place, immediately. I wonder if we would have been able to manage individual underperformers or lack of engagement; the matter didn’t arise, the situation and the case being so serious.

When we reached our decision, many people tried to express their feelings on being on the jury; it wasn’t one of enjoyment as such, but a powerful sense of fulfilment; high standards had been reached and we’d really done something and done it together. Mostly people spoke about how proud they were.

Key point: when expectations are above and beyond, people don’t want to be the one that lets the team down.


So why don’t all teams form this way? For starters, this environment of genuine importance can’t be faked. People take it seriously because it’s serious. Young lives are being ruined on both sides of crime and that gives depth to the experience. Other careers may experience similar moments of crisis, but crisis can’t be faked. The intensity can’t be maintained in a work setting; not everyone is dedicated like that to their career and the imbalance would quickly create tension that would not be resolved. The unity of purpose of a short sprint meant that the diverse team could hold together for that period but it would not be sustainable.

The following week, I was on a second trial. It wasn’t Jerry Springer, but compared to the violence and chaos of the first case it was positively recreational. That allowed us to step down gently, and return that amped up feeling of importance to normal.

Key point: put teams under pressure to achieve exceptional results; but don’t try and maintain those sprints. It can’t be done. Allow teams time to unwind and process the experience they’ve been through.

RStudio nearly has python, d3 and others

I’ve been playing around this week with python and d3 in vNext of RStudio. The next version will be 1.2 and is intended to be out later this year.

The upgrade didn’t go that well.

Firstly I had to patch the drivers for my laptop’s graphics card or it didn’t work (actually, the screen was white when I connected my multiple monitors…)

Then I had a whole other set of problems with my python installation, which basically came down to “put it on the path, at the top”.
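
One way to check what “on the path, at the top” means in practice is to ask R which python it resolves first. This is a minimal sketch using base R only; the variable names are illustrative, and it assumes the executable is simply called python on your system.

```r
# Which python executable does this R session find first? ("" if none is found)
python_on_path <- Sys.which("python")
print(python_on_path)

# PATH is searched in order, so earlier entries win.
path_entries <- strsplit(Sys.getenv("PATH"), .Platform$path.sep, fixed = TRUE)[[1]]
print(head(path_entries, 5))
```

If the first result is not the installation you expected, the entries printed second show what sits above it.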

Then we had some fighting between MiKTeX (for PDF rendering), Anaconda and something else. Which means I have to go back to RStudio 1.1 for my real work.

However, it was cool to see R and python together in one document, and able to access the same data. And d3... well, I think that’s going to be useful for the more interactive style graphics.

Gitting better at git

I use git for my work with RStudio but only in a very crude “click this and then click that” way. You know the ritual: stage-commit-pull-push and pray that nothing goes awry.

Of course, anyone wise knows that you have to know git on the command line. Not to be a guru, but to be effective.

I’ve been working with this book today: and it’s excellent. I bought my own copy.

After about an hour with the book, I checked in from the command line from my embedded powershell terminal inside RStudio, and didn’t I feel like a grown up?

What’s good about it?

  • lean and brief; well written
  • gets you working within 5 minutes
  • dispels fear by repeated practice

Caveat: It’s deeper than my knowledge, so I can’t tell if it contains the deepest advice. Also – for a windows dev – I’ve been on the command line a fair bit.

I’m recommending this book to everyone who goes near git.

Waiter! Conference for 1, please!

So, I heard from a few people that they’d been to some conferences, and I realised that I’ve not been to any kind of external training for more than 5 years. I’ve been lucky enough to change job and need to learn some new stuff on-the-job, but no conferences.

So I decided to do a staycation-style conference and do it all at my desk. 

I asked my manager and got agreement that this stood in for out-of-office conferences. I wanted to get that “saturation” effect that you get from a conference where you spend all your time thinking about new ideas, so that you are actually working on those new ideas even when you aren’t watching the content. 

Below are notes on the talks etc. I’ve put my favourite talks close to the top, but YMMV.

What I did:

  • looked for chunks of content that were quite recent rather than just watching random youtubes
  • spent as much of a day as I could doing it, to saturate my mind, which was more tricky than I thought.
  • I paused the videos occasionally to go off and google things, make notes etc.
  • sometimes I felt that I needed to know more to get the best out of sessions, so did some mini vids/reading before
  • watched most of the videos at 1.5x or 2x speed
    • this makes you a bit crazy after a while, and normal speed speech seems veeerrrrrrryyyyyy slloooooowww with huge pauses

Motherlodes of content

Main topics

Main topics that I got into, I colour code below

  • R
  • R + XXXX; where XXXX is a data science tech like Tensorflow, Spark etc.
  • bringing R to an organization (what I learned here is we are following a classic path…)
  • Nu-architecture
  • Docker / Kubernetes
  • Observability / devops++
  • Continuous deployment / release

No, I did not do any blockchain talks.

What I would do differently

Overall, I think it worked. I would do it again but “turn it up to 11” and block off even more time. I also didn’t do any Qcon talks, as the 2018 ones weren’t published yet. That’s a foolish thing to block me, I know.

My notes

I’m Pwned. You’re Pwned. We’re All Pwned

  • has 320 million compromised passwords.
  • Shodan: google for IoT contains many unlocked devices

Pros: fun overview of internet security

Cons: not much implementable information

Building a Raspberry Pi Kubernetes Cluster and running .NET Core

Great, just for dizzying, vertigo-inducing stack of technologies:

Compiling a serverless function, in a docker build, on a windows machine, to target ARM processor on a linux machine, so it can be pushed to a kubernetes cluster running an openFaas serverless function.

Pro: really fun talk

Cons: not enough about real work, hobbyist and educational, which isn’t my business

Machine Learning with R and TensorFlow (Rstudio Conf)

Pros: Great overview of tensorflow and R, great links onward for more info

cons: if you already know tensorflow it is less exciting

GitOps – Using Git as your source of truth for build, deploy and observability

  • Trying to encode infra as declarative config rather than imperative “do this, do that” scripts that build a server
  • … then source control that config
  • Then building on docker and kubernetes to implement: compare reality to the config in real time and fix it 
  • Essentially instead of deployment being a push from a build, it’s a pull from the production system
  • Include the monitoring, security etc. in the config.
  • Describes the world of controls audits as being “full of 3rd party tools that don’t do half the things they say… it’s a world filled with psychopathic bullshit”
  • “a system that is observable should also be controllable”
  • Keeping production secrets in source control, how to keep that safe (see also
  • Monitoring is for the key metrics that you already know are important and that you need to maintain in a quick overview, observability is for everything else, particularly investigating problems.
  • With a complex distributed system you should design some observable criterion about the system before you make the config  change in production.. You can’t “test” these changes because you don’t really have a test system that responds in the same way as production.
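
The “compare reality to the config and fix it” idea can be sketched as a tiny reconcile step. This is a toy illustration in R, not how Kubernetes or any GitOps tool is implemented; the service names, replica counts and the reconcile() function are all made up for the example.

```r
# Desired state, as it would be declared in source-controlled config (hypothetical values).
desired <- c(web = 3, worker = 2)

# Observed state of the running system (hypothetical values).
actual <- c(web = 2, worker = 2, legacy = 1)

# Compare reality to the config and list the changes needed to converge.
reconcile <- function(desired, actual) {
  services <- union(names(desired), names(actual))
  actions <- character(0)
  for (s in services) {
    want <- if (s %in% names(desired)) desired[[s]] else 0
    have <- if (s %in% names(actual)) actual[[s]] else 0
    if (want != have) {
      actions <- c(actions, sprintf("%s: scale from %d to %d", s, have, want))
    }
  }
  actions
}

actions <- reconcile(desired, actual)
print(actions)
```

The point of the pull model is that this comparison runs continuously inside the production system, rather than once when a build pushes a deployment.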

Pros: good end to end talk on kubernetes / continuous deploy, good alternate view of production controls

cons: very far from where most people are

Hadley Wickham: Managing many models with R

Hadley talks about the gapminder dataset, and how to do many models at the same time. How to use the purrr package to do that.
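
For a flavour of the pattern, here is a minimal sketch of fitting many models with purrr. It is not the code from the talk: it assumes the built-in mtcars data split by cylinder count as a stand-in for gapminder split by country, and the variable names are my own.

```r
library(purrr)

# Assumption: mtcars split by cylinder count stands in for gapminder split by country.
by_cyl <- split(mtcars, mtcars$cyl)

# Fit the same linear model to every group.
models <- map(by_cyl, function(df) lm(mpg ~ wt, data = df))

# Pull one summary number out of each fitted model.
r_squared <- map_dbl(models, function(m) summary(m)$r.squared)
print(r_squared)
```

The same two steps (map a model over groups, then map a summary over the models) scale from three groups to hundreds.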

Pros: good talk on modelling in R, very good quick summary of how to use lm and purrr

Cons: gapminder dataset is a bit distracting.


Testing in production

  • Want to be able to deploy a change to production within 5 minutes.
  • Increasing speed and accepting increasing risks
  • Fast rollback
  • Testing in production, deploying from trunk all the time means a rigorous way of making a change that is small enough to commit to master/trunk but not broken
  • And other things about using feature switches to do “dark launches”
  • Being able to see deployments on the monitoring
  • Casually stated that “of course you can’t do this with things that make payments”.. But not obvious why you can’t; after all we’ve already done the testing that it “works on my machine” so is logically correct. Maybe there’s concern that you can’t mock out a payments engine in production, or maybe that doesn’t differentially improve testing quality.
  • Other elements: mob programming
  • Monitoring driven development: for small changes in performance
  • 15 pairs each deploy about twice a day
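
A feature switch for a “dark launch” can be as small as a named flag checked at call time. The sketch below is a hypothetical illustration in R (the flag store, the flag name and both pricing functions are invented for the example, not taken from the talk): the new code path is deployed alongside the old one but stays off until the flag is flipped.

```r
# Hypothetical flag store; in a real system this would come from config.
feature_flags <- new.env()
assign("new_pricing", FALSE, envir = feature_flags)

# A flag counts as on only if it exists and is exactly TRUE.
is_enabled <- function(flag) {
  isTRUE(get0(flag, envir = feature_flags, inherits = FALSE))
}

old_price <- function(amount) amount * 1.20
new_price <- function(amount) round(amount * 1.20, 2)  # the change being launched dark

price <- function(amount) {
  if (is_enabled("new_pricing")) new_price(amount) else old_price(amount)
}

before <- price(9.999)                              # old path: flag is off
assign("new_pricing", TRUE, envir = feature_flags)  # release without redeploying
after <- price(9.999)                               # new path: flag is on
```

Flipping the flag back is also the fast rollback: no deployment is needed in either direction.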

Pros: quite similar to what we already do, so lots to like, explains how to be awesome at incremental improvements on existing functionality

Cons: confirmation bias, doesn’t offer a lot that is really really new, testers won’t like it, doesn’t explain how to scale this up to do breaking changes other than using feature toggles


Observability: it’s not just an ops thing

  • Not about seeing that “on average” most queries are completing in 5s; no one cares about the average, they want to know why their query isn’t working
  • Exploring data: we want sub-second response for 95th percentile, we don’t want to break someone’s flow while they are investigating
  • More on feature flags, and deploy before you release.. And then adding this feature flag to the observability data
  • Monitoring driven development, where you make
  • Using sampling as a way of keeping a long history without keeping all the data.
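
The sampling point can be sketched in a few lines: keep a fraction of the events, store the sample rate with each kept event, and weight by that rate when counting. This is my own toy illustration with simulated data, not the speaker’s implementation; the event log, the error rate and the 1-in-100 rate are all assumptions.

```r
set.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical event log: 100,000 requests, roughly 2% of them errors.
n_events <- 100000
events <- data.frame(
  id = seq_len(n_events),
  is_error = runif(n_events) < 0.02
)

# Keep roughly 1 event in 100, and store the rate alongside each kept event.
sample_rate <- 100
kept <- events[runif(n_events) < 1 / sample_rate, ]
kept$sample_rate <- sample_rate

# Each kept event stands in for sample_rate original events.
estimated_total <- sum(kept$sample_rate)
estimated_errors <- sum(kept$sample_rate[kept$is_error])
print(c(total = estimated_total, errors = estimated_errors))
```

The estimates are approximate, which is the trade: about a hundredth of the storage in exchange for some noise in the counts.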

Pros: exciting talk about really hard problems, advocates a close dev/ops working relationship

Cons: slightly chaotic delivery, questionable direct relevance to anyone less than facebook scale, no one asked her how they made the transition or if they were born on that side of the world of complexity

What is programming anyway?

  • Discussion of how we can teach programming to non-programmers, including children
  • Is programming like natural language? Or more like maths?
  • Metaphors matter because the more sure that people are that ability is innate rather than trained, the less women participate in it
  • The language metaphor helps because everyone can do it but only after practice, and you need to maintain that practice.

Pros: interesting if you want to broaden the appeal of coding (i.e., get people doing data science!), create diverse programming jobs, make people believe that code is the solution

Cons: slow to start, not really about work

Manning: Docker in Motion

Pros: solid motivation and intro to docker

cons: free content ends before we learn enough, but maybe the full course is great


Docker in 5 minutes

Pros: gives a bit of the history, very fast, fun

cons: old

The children’s illustrated guide to Kubernetes:

very short, introduces words.


Introduction to microservices, Docker and Kubernetes

  • not a conference talk, home rolled one
  • A demo of getting a docker container running, and then sending it into kubernetes. 
  • Start at the demo point and watch at 2x. 🙂

Pro: decent demo, all the deets

Con: slow to start, irrelevant attempt at explaining microservices, not better than other explanations, books, etc.


NDC: Identity server for ASP.NET core 2

5 verbs of authentication: SignIn, SignOut, Forbid, Authenticate(take a credential, turn it into a claims principal), Challenge

Pros: detail on new features of identity server and how it works with authentication providers

Cons: you need to know how IDS works and integrates into everything, hard to get excited about if you aren’t deeply familiar with ASP.NET core v1

Kubernetes for sysadmins

  • Allowing kubernetes to mount a filesystem that is raw and not on the host machine or the node… so you can detach the running process and re-attach another one.. Would that really work for a database?
  • But of course that assumes that the storage is fault tolerant
  • Actually a pretty good demo of a scaled out web app running in kubernetes

Pros: good speaker, one of top faces of kubernetes, good demo of bringing up an app in kubernetes

cons: linux focus

Sports data viz in R

  • Suggest using ggvis in shiny when plotting large datasets because of render time

Pros: good introduction to the different options d3, plotly

Cons: other than looking at the comparison of js to R, not much more

Large scale machine learning

  • Showing rstudio on the google cloud ML demo
  • And deep learning on the GCML and how you train for lots of models on that.

Pro: short, nice demos

Con: not implementable for us

Deploying tensorflow models

  • About turning tensorflow network models into services
  • Which you can do with an r package.
  • Or you can deploy the model with rstudio connect on prem
  • Or you can encode the keras model into javascript and run it standalone in a web page

Pros: strong demo

Cons: relevance for us

Building spark ML pipelines with Sparklyr

Pros: strong on demo, plenty of example code.

Cons: short on motivation, doesn’t say why Spark.


Language acquisition in Minecraft with reinforcement learning


Pros: totally different talk, totally different learning method, interesting minecraft links!

Cons: talk isn’t great.


Push button publishing in rstudio connect

  • Some interesting thoughts about using R as a first class member of the overall corporate dev ecosystem, and stages you might go through up until that point.
  • Fantastic sales pitch on R studio connect

Pros: short, good demos

Cons: doesn’t admit our developer-centric controls model


Parameterized R markdown

Practical demonstration of how to do this in R markdown, and use it on an Rstudio connect server
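
For readers who haven’t seen it, a parameterized report declares its inputs under params: in the YAML header, reads them as params$name in the body, and gets values supplied at render time through the params argument of rmarkdown::render(). The sketch below is illustrative only: the file name sales-report.Rmd, the region parameter and the render_region() helper are assumptions, not taken from the talk.

```r
# Assumed header of a hypothetical sales-report.Rmd:
#
#   ---
#   title: "Sales report"
#   params:
#     region: "North"
#   ---
#
# Inside the document, chunks refer to the value as params$region.

# Hypothetical helper: render one copy of the report for a given region.
render_region <- function(region, input = "sales-report.Rmd") {
  rmarkdown::render(
    input,
    params = list(region = region),
    output_file = paste0("sales-report-", region, ".html")
  )
}

# Usage (needs the rmarkdown package and the .Rmd file to exist):
# for (r in c("North", "South")) render_region(r)
```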

Pros: very practical report for R programmers, short, just a few minutes

Cons: maybe you knew already from reading about Rmarkdown

Drilldown data discovery with Shiny

Pro: nice demo of an interactive shiny app, also nice link of a R analysis to a google docs data set.

Con: quite specific about the UI stuff, maybe unsuitable for an org that has full-time developers.


The R admin is RAD

Pros: good ambition on introducing R, great demo on shiny

Cons: asks you to have faith that it’s good, no concrete answers obviously.


R panel discussion

  • How to scale up data science team and embed R
  • Interesting comments around not worrying about how to productionize and change control these data science efforts, that the people doing data science need to be able to do it freely without worrying about that *yet* if they have to worry about it then they won’t create
  • “..crazy things are going to happen, people are going to take a million by a million matrix and multiply it by another million by a million matrix…”
  • “we value innovation more than stability”
  • Preventing people from getting attached to a physical environment, like a weak variant of chaos monkey, where you move onto new servers to enforce independence from infrastructure.
  • “Scientific debt” for firms
  • Validating open source tools
  • “don’t confuse change management with transition management, change management is about ensuring people have new tools and have access to those tools and skills provisioning hardware, transition management is this hard thing where  you are changing people’s identity they were previously an expert in the thing that they did and now they are going to have to be new at this thing. And in their minds they were an expert in this thing and they were this person… and identity and people and their feelings.”
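
To put a number on the “million by a million” quote above, here is the back-of-envelope arithmetic. It assumes dense matrices of 8-byte doubles (R’s default numeric type) and the textbook 2n³ operation count for a dense multiply; the sustained speed of 10^12 operations per second is an assumption picked only to make the scale concrete.

```r
n <- 1e6                 # rows and columns
bytes_per_double <- 8

# Memory for ONE dense n x n matrix, in terabytes (10^12 bytes).
matrix_tb <- n * n * bytes_per_double / 1e12

# Floating point operations for a dense (n x n) by (n x n) multiply.
flops <- 2 * n^3

# Days of compute at an assumed 10^12 operations per second.
days <- flops / 1e12 / (60 * 60 * 24)

print(c(matrix_tb = matrix_tb, days = days))
```

Each matrix needs about 8 TB before the multiply even starts, which is why it counts as a “crazy thing”.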


2nd R panel

  • Tidyverse discussion
  • biggest insight is that there’s an effort to get stats models implemented in a tidy way in a paid-for effort
  • No real merits over and above following these people on twitter. Sorry.


NDC: Implementing Authorization for Applications & APIs

Demo-based talk of a sister-project of identity server for managing the authorization policies that will result from modelling real business process.

Pros: practical examples, relevant for us as we use identity server

Cons: only interesting if you plan to use policy server


Compositional UIs – the Microservices Last Mile

  • Good blast through “what is a microservice”
  • What that looks like in real life in corporations

Pros: he’s a good speaker, interesting topic for architects, big problems that need big thoughts.

Cons: a bit slow to start, took 20 mins to get to conway’s law, big things that only apply to big teams writing conjoined web apps facing large numbers of users.. Big epic problems that he then proposes some actual code to solve, when he’s talking about problems of organisational dysfunction.


Deploying Windows Container based apps using Kubernetes

  • Interesting side point: windows, linux, arm devices... now have one workflow for all these
  • Sql server installs in a container.
  • Dev environments could be in a container, same as the staging/prod environments
  • Grafana? Vs splunk

Pros: great coverage of docker, good introduction, decent steps towards kubernetes and using that in a mixed mode windows and linux environment

Cons: lots of chat, no demos.


Hack your career

I summarise: get a blog, get github, get twitter, do some work in public, get known for being a blogger, and speak at meetups and conferences until you get made redundant. Then use that window of money with no work to achieve success {repeat as needed…}, and you can work from home and live in a mansion.

Pros: inspirational, living the tech dream, some realistic messages

Cons: lacks enough specifics to be really useful, doesn’t make enough of the sacrifices and the compromises that surely would be needed


Diverse roles in tech will lead to diversity…?

Welcome to international women’s day.

It’s not controversial to say that technology jobs aren’t filled by men and women equally. I think that there is an opportunity for more diverse working methods and more diverse thinking producing better results. I can’t prove this; I think it is true because most of my business-facing development work would benefit from multiple points of view.

Great; so how would we attract diverse talent… And more importantly, how do we ensure that we get the benefits of that diversity? Because – in my opinion – acquiring diverse talents and then forcing them into the same suits and ties, the same modes of interaction, is asking for disappointment.

So how do we allow diverse talent chances to flourish in technology? There’s only one way to write code, right?

Well, I think that R and other data analysis languages are a chance to create some diverse roles that not only are filled by diverse backgrounds… but get woven into current roles. That type of programming is accessible because it stays close to what you already know and doesn’t require you to change tribe. So it’s not like you need to be an ex-stats PhD to use it, you don’t need to be an ex-developer…  You don’t need to be an ex-anything, you can be an accountant who uses it now, you can be a business analyst who mangles HR systems or anything else, all you need is to be motivated and believe that you can.

That accessibility actually carries through into the language, the way you use it to solve problems, the kind of problems you want to solve with it, and the kind of outputs you get (rich data, models, charts, diagrams, etc.). All diverse, all open and all accessible and attractive to newcomers.

I think that this has happened before; home computers democratised computers and took them out of university maths departments; web tech put design right in the middle of the developer job with coding tech like HTML, CSS and even javascript. I think that a web design shop now has many diverse roles, from plain coder to hybrid design/coders and visual designers, all working together. That just didn’t exist before 1995. Probably games have moved the same way; though you could challenge me on the demographics there.

So, it might not be directly an issue for women, but it could be a factor. If you look at the number of #RLadies out there...

And now for an R pun: I think that we should do something with suff-R-gette?