Manipulating JSON using jq


2017-08-07

jq is a command-line tool for transforming (filtering, slicing, mapping, etc.) JSON data. It treats JSON as a stream of values, much as AWK treats text files as streams of records.

Let's use data from GitHub as our playground. We will start by fetching a list of repositories as JSON with HTTPie (the http command). Feel free to replace the zaiste username with any other handle. We will store the result as zaiste.json for convenience; you can also pipe the response directly to jq.

http https://api.github.com/users/zaiste/repos > zaiste.json

The GitHub API returns up to 30 entries per request. If you need more, you need to paginate by following the Link response header, e.g. https://api.github.com/users/zaiste/repos?page=2 for the next page of my public repositories.
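Here is a minimal sketch of building the URL for a given page, assuming the page and per_page query parameters of the GitHub REST API (per_page is capped at 100). The page_url helper is a hypothetical name introduced only for illustration, and the actual request is left commented out because it needs network access.

```shell
# Hypothetical helper: build the URL for one page of a user's repositories.
# Arguments: username, entries per page (GitHub caps this at 100), page number.
page_url() {
  printf 'https://api.github.com/users/%s/repos?per_page=%s&page=%s\n' "$1" "$2" "$3"
}

url=$(page_url zaiste 100 2)
echo "$url"

# Fetching the page with HTTPie (not run here):
# http "$url" > zaiste-page2.json
```

Following the Link header remains the more robust approach, since it tells you when there are no further pages.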

Let's verify we got 30 repositories.

jq length zaiste.json

Let's reduce that list to only source repositories, i.e. excluding forks.

jq 'map(select(.fork == false)) | length' zaiste.json

map(select(.fork == false)) is equivalent to [.[] | select(.fork == false)]. If the expression passed to select returns true for the current element, select outputs that element unmodified. If the expression returns false, select outputs nothing.
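To see the equivalence in practice, here is a small sketch that runs both forms on a hypothetical three-element sample (standing in for zaiste.json) and compares the results.

```shell
# Hypothetical sample standing in for zaiste.json: two sources and one fork.
repos='[{"name":"a","fork":false},{"name":"b","fork":true},{"name":"c","fork":false}]'

# -c prints compact, single-line JSON so the two results are easy to compare.
with_map=$(printf '%s' "$repos" | jq -c 'map(select(.fork == false))')
with_brackets=$(printf '%s' "$repos" | jq -c '[.[] | select(.fork == false)]')

echo "$with_map"       # → [{"name":"a","fork":false},{"name":"c","fork":false}]
echo "$with_brackets"  # same output as above
```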

Each repository has many fields. Let's keep only the id and name key/value pairs.

jq 'map(select(.fork == false)) | .[] | {id, name}' zaiste.json
{
  "id": 2187333,
  "name": "11.rupy.eu"
}
{
  "id": 44391427,
  "name": "ansible-clojure-web"
}
...

The result is a sequence of adjacent JSON objects rather than a single JSON document. You may prefer an actual list.

jq '[map(select(.fork == false)) | .[] | {id, name}]' zaiste.json
[
  {
    "id": 2187333,
    "name": "11.rupy.eu"
  },
  {
    "id": 44391427,
    "name": "ansible-clojure-web"
  },
  ...
]
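The same list can also be built by doing the projection inside map, which avoids unpacking and re-wrapping the array. The sketch below uses a hypothetical two-element sample and checks that both spellings agree; {id, name} is shorthand for {id: .id, name: .name}.

```shell
# Hypothetical sample: each entry has an extra "size" field that gets dropped.
repos='[{"id":1,"name":"a","fork":false,"size":10},{"id":2,"name":"b","fork":true,"size":20}]'

# Form used in the article: filter, unpack, project, then re-wrap in [...].
wrapped=$(printf '%s' "$repos" | jq -c '[map(select(.fork == false)) | .[] | {id, name}]')

# Compact alternative: filter and project each element inside a single map.
mapped=$(printf '%s' "$repos" | jq -c 'map(select(.fork == false) | {id, name})')

echo "$wrapped"  # → [{"id":1,"name":"a"}]
echo "$mapped"   # → [{"id":1,"name":"a"}]
```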

You may also need just the values of a particular field, e.g. a list of project names.

jq 'map(select(.fork == false)) | .[] | {id, name} | .name' zaiste.json
"11.rupy.eu"
"ansible-clojure-web"
...

It is possible to remove the quotes using the --raw-output (-r) option.

jq --raw-output 'map(select(.fork == false)) | .[] | {id, name} | .name' zaiste.json
11.rupy.eu
ansible-clojure-web
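As a quick side-by-side, the sketch below prints names from a hypothetical sample with and without -r. It also skips the intermediate {id, name} object, since only .name is needed in the end.

```shell
# Hypothetical sample standing in for zaiste.json.
repos='[{"name":"a","fork":false},{"name":"b","fork":true},{"name":"c","fork":false}]'

# Default output: each name is a JSON string, so it keeps its quotes.
quoted=$(printf '%s' "$repos" | jq '.[] | select(.fork == false) | .name')

# With -r (--raw-output): strings are printed as plain text.
raw=$(printf '%s' "$repos" | jq -r '.[] | select(.fork == false) | .name')

echo "$quoted"
echo "$raw"
```

The raw form is the one you usually want when feeding names into other shell tools.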

Let's change our data set from repositories to issues. We will fetch the first 30 open issues from the nodejs/node repository.

http https://api.github.com/repos/nodejs/node/issues > node.json

Let's count the number of issues having more than one label assigned:

jq 'map(select((.labels|length)>1)) | length' node.json

Inside the select argument expression, we transform the current element by extracting its labels array, whose value is then passed to the length filter.
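To make the intermediate step visible, this sketch first prints the label count of every issue in a hypothetical sample, then applies the same test inside select.

```shell
# Hypothetical sample standing in for node.json: issues with 0, 2 and 1 labels.
issues='[{"number":1,"labels":[]},{"number":2,"labels":[{"name":"bug"},{"name":"doc"}]},{"number":3,"labels":[{"name":"bug"}]}]'

# Label count per issue: this is the value that select compares against 1.
counts=$(printf '%s' "$issues" | jq -c 'map(.labels | length)')

# Number of issues with more than one label.
multi=$(printf '%s' "$issues" | jq 'map(select((.labels | length) > 1)) | length')

echo "$counts"  # → [0,2,1]
echo "$multi"   # → 1
```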

Let's count the number of pull requests:

jq 'map(select(has("pull_request"))) | length' node.json
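The has filter only checks whether a key is present, so negating it gives the plain issues. A small sketch on a hypothetical sample (the URL is a placeholder, not a real endpoint):

```shell
# Hypothetical sample: only issue 2 carries a "pull_request" key.
issues='[{"number":1},{"number":2,"pull_request":{"url":"https://example.invalid/pull/2"}},{"number":3}]'

# Entries that are pull requests.
prs=$(printf '%s' "$issues" | jq 'map(select(has("pull_request"))) | length')

# Entries that are plain issues: negate the test with "not".
plain=$(printf '%s' "$issues" | jq -c 'map(select(has("pull_request") | not) | .number)')

echo "$prs"    # → 1
echo "$plain"  # → [1,3]
```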

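Let's select only issues in a closed state. The request above returns only open issues by default, so this filter is mainly useful on data fetched with the state=all query parameter: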

jq 'map(select(.state|index("closed")))' node.json
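The index filter performs a substring search: it yields a position (0 for an exact match, which jq treats as true) or null. An exact comparison with == is a common alternative when the field holds a fixed set of values. The sketch below shows both on a hypothetical two-issue sample.

```shell
# Hypothetical sample standing in for node.json.
issues='[{"number":1,"state":"open"},{"number":2,"state":"closed"}]'

# What index("closed") yields for each state: null (no match) or a position.
positions=$(printf '%s' "$issues" | jq -c 'map(.state | index("closed"))')

# Exact comparison, keeping only the issue numbers for brevity.
closed=$(printf '%s' "$issues" | jq -c 'map(select(.state == "closed") | .number)')

echo "$positions"  # → [null,0]
echo "$closed"     # → [2]
```

In jq only false and null count as false, which is why a position of 0 still passes the select.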

Tips & Tricks

Merge several JSON files:

jq -s add file1.json file2.json file3.json
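A small sketch of what this does, assuming each file holds a JSON array: -s (--slurp) reads all inputs into one array, and add then concatenates its elements. The file contents below are hypothetical and live in a temporary directory.

```shell
# Create three hypothetical input files in a scratch directory.
dir=$(mktemp -d)
printf '[1,2]' > "$dir/file1.json"
printf '[3]'   > "$dir/file2.json"
printf '[4,5]' > "$dir/file3.json"

# Slurp gives [[1,2],[3],[4,5]]; add concatenates the inner arrays.
merged=$(jq -c -s add "$dir/file1.json" "$dir/file2.json" "$dir/file3.json")
echo "$merged"  # → [1,2,3,4,5]

rm -r "$dir"
```

When the files hold objects instead of arrays, add merges their keys, with later files winning on conflicts.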