Delve is a debugger for the Go programming language. The goal of the project is to provide a simple, full featured debugging tool for Go.

If we run our Go service using a Makefile, with a command like make run, it can be hard to find where to hook in and call dlv debug. We can get around this issue by attaching the Delve debugger to our running service instead. First, set a breakpoint in the code, on the code path you intend to trigger, by adding the statement runtime.Breakpoint(). Don’t forget to import the runtime package.
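As a sketch, the breakpoint might look like this; the handler name and the environment-variable guard are assumptions, added so the program still runs normally outside the debugger (attach with dlv attach followed by the service’s PID):

```go
package main

import (
	"fmt"
	"os"
	"runtime"
)

// handleRequest stands in for the code path we want to debug.
func handleRequest(id int) string {
	// Guarded by a hypothetical DEBUG_BREAK env var so ordinary
	// runs are unaffected; under an attached Delve session this
	// traps into the debugger.
	if os.Getenv("DEBUG_BREAK") != "" {
		runtime.Breakpoint()
	}
	return fmt.Sprintf("handled request %d", id)
}

func main() {
	fmt.Println(handleRequest(42))
}
```

When the breakpoint fires, Delve stops the service on that line and you can inspect variables and step through the code.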

Scoping in Go is built around the notion of code blocks. You can find several good explanations of how variable scoping works in Go online. I’d like to highlight one slightly unintuitive consequence of Go’s block scoping if you’re used to a language like Python; keep in mind that this example does not break with Go’s notion of block scoping:

Let’s start with a common pattern in Python:

class Data(object):

    def __init__(self, val):
        self.val = val

    def __repr__(self):
        return 'Data({})'.format(self.val)

li = [Data(2), Data(3), Data(5)]

print(li)

for d in li:
    d.val += 1

print(li)

Output:

[Data(2), Data(3), Data(5)]
[Data(3), Data(4), Data(6)]
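In Python, mutating d inside the loop changes the objects in the list. A sketch of the analogous loop in Go behaves differently, because the range variable is a copy of each element, not a reference to it:

```go
package main

import "fmt"

type Data struct{ Val int }

// incrementAll attempts to add 1 to each element through the range
// variable; because d is a copy of the slice element, the slice is
// left unchanged.
func incrementAll(li []Data) {
	for _, d := range li {
		d.Val++ // modifies the copy, not the element in li
	}
}

func main() {
	li := []Data{{2}, {3}, {5}}
	fmt.Println(li)
	incrementAll(li)
	fmt.Println(li) // unchanged: [{2} {3} {5}]
}
```

To actually mutate the elements, index into the slice (li[i].Val++) instead of writing through the range variable.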

The use of context in Go can help you pass metadata through your program with helpful, related information about a call. Let’s build an example where we set a context key, “stack”, which keeps a history of the function names called over the lifetime of the context. As we pass the context object through a few layers of functions, we’ll append the name of the function to the value of the context key "stack".

Go uses goroutines to execute multiple bits of code at the same time. Channels allow for the aggregation of the results of these concurrent calls after they have finished.

Consider a case where we want to make several GET requests to a server. The server takes some time to process each request and, in many cases, can handle many simultaneous connections. In a language like Python, we might do the following to make several requests:
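The goroutine-and-channel pattern described above might be sketched as follows in Go; fetch here simulates the slow request with a sleep (a real version would call http.Get), and the URLs are placeholders:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// fetch simulates a slow GET request.
func fetch(url string) string {
	time.Sleep(10 * time.Millisecond)
	return "response from " + url
}

// fetchAll issues every request in its own goroutine and aggregates
// the results over a buffered channel once they have all finished.
func fetchAll(urls []string) []string {
	results := make(chan string, len(urls))
	var wg sync.WaitGroup
	for _, u := range urls {
		wg.Add(1)
		go func(u string) {
			defer wg.Done()
			results <- fetch(u)
		}(u)
	}
	wg.Wait()
	close(results)

	var out []string
	for r := range results {
		out = append(out, r)
	}
	return out
}

func main() {
	for _, r := range fetchAll([]string{"/a", "/b", "/c"}) {
		fmt.Println(r)
	}
}
```

Because the requests run concurrently, the total time is roughly that of the slowest request rather than the sum of all of them; note the results may arrive in any order.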

Go closures

Say we need a map to store various versions of a configuration in Go. Here is a simple example of the structure:

envs := map[string]string{
    "dev":  "1",
    "prod": "2",
}

Given this config map, we need to create an additional map that uses the same strings as the keys, but has functions for values. The catch is that the body of each function needs to make use of the value from its corresponding key. For example, functions["prod"] should have a value of type func() string and the body of that function should make use of the value, envs["prod"]. Here’s a concrete example:
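A sketch of one way to build such a map; the body of each function here (returning a string built from the key and its version) is an assumption, the point is that each closure captures the value from its own key:

```go
package main

import "fmt"

// buildFunctions returns a map with the same keys as envs, where each
// value is a closure over that key's version string.
func buildFunctions(envs map[string]string) map[string]func() string {
	functions := make(map[string]func() string)
	for k, v := range envs {
		// Shadow the loop variables so each closure captures its own
		// copy (required before Go 1.22, where loop variables were
		// shared across iterations).
		k, v := k, v
		functions[k] = func() string {
			return k + " is on version " + v
		}
	}
	return functions
}

func main() {
	envs := map[string]string{
		"dev":  "1",
		"prod": "2",
	}
	functions := buildFunctions(envs)
	fmt.Println(functions["prod"]())
}
```

Without the per-iteration copies, every closure built before Go 1.22 would see the loop variables' final values, which is the classic capture pitfall this pattern guards against.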

Markdown is a useful tool – these blog posts are written in it. I like Markdown because once you learn it, it feels invisible. It is minimal and intuitive. However, sometimes you need it to do things a little differently.

I ran into an issue where I had content that had to be written in Markdown only (no HTML), but later needed to be rendered as HTML and inserted onto a webpage, and I needed to add attributes to the HTML tags that were generated. The content itself needed to look like this:

supervisor is a UNIX utility, written in Python, for managing and respawning long-running processes to ensure they are always running. Or, according to its website:

Supervisor is a client/server system that allows its users to monitor and control a number of processes on UNIX-like operating systems.

Installation

supervisor can be installed with pip

$ pip install supervisor
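supervisor finds programs through config sections in supervisord.conf (or files included from it). A minimal sketch of a section for the test_proc example below; the script path is an assumption:

```ini
; minimal program section; the path to the script is an assumption
[program:test_proc]
command=python /path/to/test_proc.py
autostart=true
autorestart=true
```

With autorestart=true, supervisord respawns the process if it exits.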

Given a script test_proc.py, start the process under supervisor as

$ sudo supervisorctl start test_proc

Now it will run forever and you can see the process running with

Querying S3 with Presto

This post assumes you have an AWS account and a Presto instance (standalone or cluster) running. We’ll use the Presto CLI to run the queries against the Yelp dataset. The dataset is a JSON dump of a subset of Yelp’s data for businesses, reviews, checkins, users and tips.

Configure Hive metastore

Configure the Hive metastore to point at our data in S3. We are using the Docker container inmobi/docker-hive.

Creating a Presto Cluster

I first came across Presto when researching data virtualization - the idea that all of your data can be integrated regardless of its format or storage location. One can use scripts or periodic jobs to mash up data or create regular reports from several independent sources. However, these methods don’t scale well, especially when the queries change frequently or the data is ingested in realtime. Presto allows one to query a variety of data sources using SQL and presents the data in a standard table format, where it can be manipulated and JOINed like traditional relational data.