Keeping an Engineering Notebook

The best upgrade I’ve made to my workflow in the past year was to start keeping an engineering notebook. Whenever I start a new project, the first thing I do is create a new section in my engineering notebook. It’s really simple. With a tiny script, I generate a dedicated folder for the new project, plus three sub-folders and a README:

    some_new_work
    |__ notes/
    |__ data/
    |__ scripts/
    |__ README.md

As simple as this structure seems, it has had a tremendous impact on my work. In this post, I want to try to unpack why and how I think that’s working. ...
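The script itself isn't shown in this excerpt. As a rough sketch of what such a scaffold could look like in Python, assuming only the folder layout described above (the function name `create_project` and the README contents are hypothetical, not the author's):

```python
from pathlib import Path

# Sub-folders named in the post; everything else here is an assumption.
SUBFOLDERS = ("notes", "data", "scripts")


def create_project(name: str, root: Path = Path(".")) -> Path:
    """Create root/name with notes/, data/, scripts/ and a README.md."""
    project = root / name
    for sub in SUBFOLDERS:
        # parents=True also creates the project folder on first use;
        # exist_ok=True makes the function safe to re-run.
        (project / sub).mkdir(parents=True, exist_ok=True)

    readme = project / "README.md"
    if not readme.exists():
        # Hypothetical starter content: just a title line.
        readme.write_text(f"# {name}\n")
    return project
```

Calling `create_project("some_new_work")` would create the layout under the current directory. Re-running it leaves an existing README untouched.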

January 28, 2018 · 4 min · Dan Kleiman

More Efficient Solutions to the Top N per Group Problem

In my last post, I tried to tackle getting the Top N Results per Group from a BigQuery dataset. When I tweeted out the post, I got some great feedback and suggestions for more efficient ways to get the same results, so in this post I want to try to understand why the alternatives are more efficient. ...

November 7, 2017 · 5 min · Dan Kleiman

Top N Per Group in BigQuery

EDIT: After I posted this initially, I got some great feedback, so I wrote a follow-up post here. In this post, we are going to explore a strategy for collecting the Top N results per Group over a mixed dataset, all in a single query. I stumbled onto this solution the other day, mostly driven by the fear that I was re-scanning my BigQuery data too often. At the time, the only way I knew how to look at a Top 10 list of a subset of the data was to add a WHERE clause limiting the whole data set to a single group and combine with ORDER BY and LIMIT clauses. For each group, I would just modify the WHERE clause, rescan all the data, and get new results. I thought there had to be an easier way to get the same ordered subset for any particular group in the data, all at once. It turns out, there is a much more efficient way to solve this problem. ...
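The query from the post isn't included in this excerpt. To illustrate just the underlying idea (collecting the top N rows for every group in one pass, instead of filtering to one group and rescanning each time), here is a small Python sketch. It is not BigQuery SQL and not the author's solution; the function name and the `(group, item, score)` row shape are assumptions made for the example.

```python
import heapq
from collections import defaultdict


def top_n_per_group(rows, n):
    """Return the n highest-scoring (score, item) pairs for each group.

    rows is an iterable of (group, item, score) tuples and is read once.
    """
    by_group = defaultdict(list)
    for group, item, score in rows:
        # Store score first so tuples compare by score.
        by_group[group].append((score, item))

    # nlargest returns each group's pairs in descending order.
    return {g: heapq.nlargest(n, pairs) for g, pairs in by_group.items()}
```

The per-group alternative described above would amount to looping over the groups and filtering the full row set once per group, which is the repeated scan the post sets out to avoid.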

October 30, 2017 · 5 min · Dan Kleiman

Don't Blow Your BigQuery Budget on Unknown Data!

It’s easy to blow your BigQuery budget when you are exploring a new data set. Because you’re billed for the amount of data scanned, not the ultimate result set, when you don’t know what you’re looking for, you can end up with wasteful queries. In this post, I’m going to share some tips for more efficiently scanning data in BigQuery when you don’t quite know what you need. ...

October 6, 2017 · 5 min · Dan Kleiman

GoBridge with Bill Kennedy

Last weekend, I had the chance to volunteer at a GoBridge event taught by Bill Kennedy of Ardan Labs. I’m trying to make 2017 my year of learning Go, so helping out at the event felt like a natural extension and a great way to connect with more people in the Go community. Going in with Ruby as my first language, I braced myself for static typing and wanted concurrent programming to bend my brain, but that’s not really what happened at all. ...

February 12, 2017 · 8 min · Dan Kleiman

My 5 Strategies for Learning Go in 2017

Over the past couple of years, one thing I’ve become more and more aware of is the unease and uncertainty of diving into a new project. No matter what the new X is, I find I always go through the same set of uncomfortable feelings on my initial approach. Now, though, I’m starting to become familiar enough with this process that – even though the discomfort doesn’t go away in the initial learning stages – I can embrace it, coexist with it, and forge ahead in learning, because of the strategies I’m going to lay out here. ...

December 29, 2016 · 6 min · Dan Kleiman

Rails Security Exercises from Bearclaw

In this post, I’m going to tell you what I learned doing a series of Rails security exercises developed by Bearclaw, a Rails security consultancy. Before I go into the exercises, though, I want to send a huge thank you to Ali Najaf, founder of Bearclaw. What I’ve learned here is due to the thoughtfulness of the exercises he’s put together and his willingness to try something new by sharing them with me. Normally these exercises are part of a workshop he leads in person. ...

September 24, 2016 · 9 min · Dan Kleiman

Nice Try, NilClass

I love that feeling when a new concept starts to come together in your mind and you can point to all the converging sources of insight. Right now, I can’t tell if I’m fooling myself, hiding some logic, or making my code more readable with this particular concept, but when I put together these three pieces of information, I think I start to see something emerge. I’ve been inspired to do some more digging into these kinds of questions lately thanks to the awesome new Ruby Book Club Podcast. Co-hosts Nadia Odunayo and Saron Yitbarek are leading us chapter-by-chapter through different Ruby books and sharing their thoughts on the podcast as they go. ...

April 10, 2016 · 4 min · Dan Kleiman

Migrating Posts and Pages from Wordpress to Jekyll

This is Part 1 in a series on Migrating from Wordpress to Jekyll. The documentation for getting started with Jekyll is great. I’m not going to rehash everything that’s covered there. Instead, this post and the others in the series will be more like, “here’s the order I wish I had done things in” or “here’s everything I ended up needing to pull together to get stuff working”. I hope it helps you and saves you time if you ever decide to do a similar migration from a self-hosted Wordpress install to Jekyll. So here we go…. ...

March 11, 2016 · 5 min · Dan Kleiman

Migrating from Wordpress to Jekyll

So, I’ve decided to migrate my Tai Chi site at dankleiman.com from Wordpress to a new static site using Jekyll. Since I haven’t posted there in almost two years, but I get a steady stream of new subscribers who want to learn about Tai Chi, qigong, and meditation, I thought it would be good to give the 300+ pages and posts a more evergreen feel. ...

March 9, 2016 · 1 min · Dan Kleiman