jq is a command-line tool for transforming (filtering, slicing, mapping, etc.) JSON data. It treats JSON data as a stream of values, much as AWK treats a text file as a stream of records.
Let's use data from GitHub as our playground. We will start by fetching a list of repositories as JSON. Feel free to replace the zaiste username with any other handle. We will store the result as zaiste.json for convenience. You can also pipe the response directly to jq. The http command below comes from HTTPie.
http https://api.github.com/users/zaiste/repos > zaiste.json
The GitHub API returns up to 30 entries per request by default. If you need more, you need to paginate by following the Link response header, e.g. https://api.github.com/users/zaiste/repos?page=2 for the next page of my public repositories.
Let's verify we got 30 repositories.
jq length zaiste.json
Let's reduce that list to only source repositories, i.e. excluding forks.
jq 'map(select(.fork == false)) | length' zaiste.json
map(select(.fork == false)) is equivalent to [.[] | select(.fork == false)]. If the argument expression of select returns true for the current element, select returns that element unmodified. If the argument returns false, select outputs nothing.
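To see the equivalence on a small scale, here is a minimal sketch that runs both forms against a hand-written two-element array. The sample data and the variable names are made up for illustration and are not part of the GitHub response.

```shell
# Hypothetical sample data: one source repository and one fork.
sample='[{"name":"a","fork":false},{"name":"b","fork":true}]'

# Form 1: map applies select to every element and collects the results.
with_map=$(printf '%s' "$sample" | jq -c 'map(select(.fork == false))')

# Form 2: unpack the array with .[], filter the stream, re-collect with [ ... ].
with_brackets=$(printf '%s' "$sample" | jq -c '[.[] | select(.fork == false)]')

# Both lines should be identical: the fork is dropped, the source repo is kept.
echo "$with_map"
echo "$with_brackets"
```

The -c option only makes the output compact, so that each result fits on one line and the two are easy to compare.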
Each repository has many fields. Let's select only the name and id key/value pairs.
jq 'map(select(.fork == false)) | .[] | {id, name}' zaiste.json
{
"id": 2187333,
"name": "11.rupy.eu"
}
{
"id": 44391427,
"name": "ansible-clojure-web"
}
...
The result is a sequence of adjacent JSON objects. You may prefer an actual list.
jq '[map(select(.fork == false)) | .[] | {id, name}]' zaiste.json
[
{
"id": 2187333,
"name": "11.rupy.eu"
},
{
"id": 44391427,
"name": "ansible-clojure-web"
},
...
]
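Since map already returns an array, the same list can probably be built without unpacking and re-wrapping it. A minimal sketch, assuming made-up sample data in place of zaiste.json:

```shell
# Hypothetical sample data standing in for zaiste.json.
sample='[{"id":1,"name":"a","fork":false,"size":10},{"id":2,"name":"b","fork":true,"size":20}]'

# select and the object construction both run inside a single map,
# so the result is an array without an outer [ ... ].
result=$(printf '%s' "$sample" | jq -c 'map(select(.fork == false) | {id, name})')

echo "$result"
```

When select outputs nothing for an element, the rest of the pipe inside map does not run for it, so forks never reach the {id, name} step.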
You may also need just the values of a particular field, e.g. a list of project names.
jq 'map(select(.fork == false)) | .[] | {id, name} | .name' zaiste.json
"11.rupy.eu"
"ansible-clojure-web"
...
It is possible to remove the quotes using the --raw-output option.
jq --raw-output 'map(select(.fork == false)) | .[] | {id, name} | .name' zaiste.json
11.rupy.eu
ansible-clojure-web
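Raw output is mostly useful when the values are handed to other shell commands. The sketch below, which uses made-up sample data and the short form -r of --raw-output, reads the names line by line in a loop. It also reads .name directly, since the intermediate {id, name} object is not needed when only one field is wanted.

```shell
# Hypothetical sample data standing in for zaiste.json.
sample='[{"name":"alpha","fork":false},{"name":"beta","fork":true},{"name":"gamma","fork":false}]'

# One unquoted name per line, forks excluded.
names=$(printf '%s' "$sample" | jq -r '.[] | select(.fork == false) | .name')

# Feed the lines to a loop through a here-document so that
# the counter is updated in the current shell, not a subshell.
count=0
while IFS= read -r name; do
  echo "repo: $name"
  count=$((count + 1))
done <<EOF
$names
EOF

echo "$count"   # prints 2
```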
Let's change our data set from repositories to issues. We will fetch the first 30 open issues from the nodejs/node repository.
http https://api.github.com/repos/nodejs/node/issues > node.json
Let's count the number of issues having more than one label assigned:
jq 'map(select((.labels|length)>1)) | length' node.json
Inside the argument expression of select, we take the labels array of the current element and pass it to the length filter. The element is selected when the resulting count is greater than one.
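To make the intermediate step visible, the following sketch first prints the label count per issue and then counts the issues with more than one label. The three sample issues are invented for illustration and only carry the fields used here.

```shell
# Hypothetical sample data standing in for node.json.
sample='[{"number":1,"labels":[]},{"number":2,"labels":[{"name":"bug"},{"name":"doc"}]},{"number":3,"labels":[{"name":"bug"}]}]'

# Per-issue view: replace the labels array by its length.
printf '%s' "$sample" | jq -c 'map({number, labels: (.labels | length)})'

# Same condition as in the text: keep issues with more than one label, then count them.
many=$(printf '%s' "$sample" | jq 'map(select((.labels | length) > 1)) | length')

echo "$many"   # prints 1
```

The parentheses around .labels | length matter: without them the pipe would bind more loosely than the comparison.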
Let's count the number of pull requests:
jq 'map(select(has("pull_request"))) | length' node.json
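Let's select only issues in a closed state. Since the request above fetched only open issues, expect an empty array unless you fetch the issues with ?state=all: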
jq 'map(select(.state|index("closed")))' node.json
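The index("closed") test works because index returns 0 for a match at the start of the string, which jq treats as true, and null when there is no match. A direct equality comparison expresses the same intent more plainly. A minimal sketch on made-up sample data:

```shell
# Hypothetical sample data: one open and one closed issue.
sample='[{"number":1,"state":"open"},{"number":2,"state":"closed"}]'

# Keep closed issues only and reduce each of them to its number.
closed=$(printf '%s' "$sample" | jq -c 'map(select(.state == "closed") | .number)')

echo "$closed"   # prints [2]
```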
Tips & Tricks
Merge several JSON files:
jq -s add file1.json file2.json file3.json
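This comes in handy for combining several pages of an API response into one list. Here is a small sketch of what happens, using two throwaway files in a temporary directory; the file names and contents are assumptions made for the example.

```shell
# Create two hypothetical "pages" in a temporary directory.
tmp=$(mktemp -d)
printf '%s' '[{"id":1}]' > "$tmp/page1.json"
printf '%s' '[{"id":2},{"id":3}]' > "$tmp/page2.json"

# -s (--slurp) reads all inputs into one array of arrays;
# add then concatenates the inner arrays.
merged=$(jq -c -s 'add' "$tmp/page1.json" "$tmp/page2.json")

echo "$merged"   # prints [{"id":1},{"id":2},{"id":3}]

rm -r "$tmp"
```

If the files contain objects instead of arrays, add merges them into a single object, and keys from later files win.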