Pulling IDs out of objects and back again -- Part II
The last post about jq was about converting data in object keys to attributes and back again. Because objects usually have multiple keys, this also implies conversion between one object and an array of objects.
This time the data is readily available as attributes and can be addressed with
a simple .attr expression. Also, the (initial) focus is a single object. But even
then there are some nifty things to be explained.
One Object
Given this example input:
{
  "id": "foo",
  "data": {
    "x1": 42,
    "x2": 127
  }
}
the expected result is:
{
  "id": "foo",
  "x1": 42,
  "x2": 127
}
The most direct way to do this is this filter:
.data.id = .id | .data
although the id attribute is then added at the end, which is semantically the
same JSON but might occasionally annoy the human reader (like me). But there’s
an easy fix for this:
{id} + .data
And it’s shorter too!
If the key id should be renamed in the resulting object, then the {id}
shorthand must be expanded like this:
{new_id: .id} + .data
So merging objects using + allows me to control the order of the keys.
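For readers more at home in a general-purpose language, the difference between the two filters can be sketched in Python, whose dicts also keep insertion order. This is only a rough analogue, not jq itself, and the variable names are made up for illustration:

```python
import json

doc = json.loads('{"id": "foo", "data": {"x1": 42, "x2": 127}}')

# Analogue of `.data.id = .id | .data`: assign into the nested object,
# then extract it. The new key lands at the end.
assigned = dict(doc["data"])
assigned["id"] = doc["id"]

# Analogue of `{id} + .data`: start from a one-key object and merge the
# nested object into it, so "id" comes first.
merged = {"id": doc["id"], **doc["data"]}

# Analogue of `{new_id: .id} + .data`: same merge, different key name.
renamed = {"new_id": doc["id"], **doc["data"]}

print(json.dumps(assigned))  # {"x1": 42, "x2": 127, "id": "foo"}
print(json.dumps(merged))    # {"id": "foo", "x1": 42, "x2": 127}
print(json.dumps(renamed))   # {"new_id": "foo", "x1": 42, "x2": 127}
```

Both results compare equal as data; only the key order differs.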
The reverse operation is this:
{ id, data: del(.id) }
This time there are no more tricks up the sleeve.
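The same round trip, again as a hedged Python sketch rather than jq. The helper names flatten and nest are hypothetical and exist only for this example:

```python
def flatten(doc):
    # Analogue of `{id} + .data`
    return {"id": doc["id"], **doc["data"]}


def nest(flat):
    # Analogue of `{ id, data: del(.id) }`: keep id at the top level and
    # move every other key into the data container.
    rest = {key: value for key, value in flat.items() if key != "id"}
    return {"id": flat["id"], "data": rest}


original = {"id": "foo", "data": {"x1": 42, "x2": 127}}
flat = flatten(original)
print(flat)        # {'id': 'foo', 'x1': 42, 'x2': 127}
print(nest(flat))  # {'id': 'foo', 'data': {'x1': 42, 'x2': 127}}
```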
With Arrays
If data is not an object itself but an array of objects, then pushing id down
is also easily possible. Given this input
{
  "id": "foo",
  "data": [
    {
      "x1": 42,
      "x2": 127
    },
    {
      "x1": 123,
      "x2": 456
    }
  ]
}
the expected output is this:
[
  {
    "id": "foo",
    "x1": 42,
    "x2": 127
  },
  {
    "id": "foo",
    "x1": 123,
    "x2": 456
  }
]
This can be done easily using this filter:
[ {id} + .data[] ]
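The filter works because .data[] produces one output per array element, so the merge is evaluated once for each of them and the surrounding brackets collect the results. A list comprehension is a reasonable Python analogue (a sketch only; rows is an illustrative name):

```python
doc = {
    "id": "foo",
    "data": [
        {"x1": 42, "x2": 127},
        {"x1": 123, "x2": 456},
    ],
}

# One merged object per element of data, each one carrying the parent's id.
rows = [{"id": doc["id"], **item} for item in doc["data"]]

for row in rows:
    print(row)
# {'id': 'foo', 'x1': 42, 'x2': 127}
# {'id': 'foo', 'x1': 123, 'x2': 456}
```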
At first glance the reverse operation is also easy:
{
  id: .[0].id,
  data: map(del(.id))
}
Indeed, this reverses the output given above to its original input. But clearly
it only works correctly when all id values are identical.
But what if the id values are not identical?
So given this input
[
  {
    "id": "foo",
    "x1": 42
  },
  {
    "id": "foo",
    "y1": 123
  },
  {
    "id": "bar",
    "z1": 999
  }
]
the expected output is this:
[
  {
    "id": "bar",
    "data": [
      {
        "z1": 999
      }
    ]
  },
  {
    "id": "foo",
    "data": [
      {
        "x1": 42
      },
      {
        "y1": 123
      }
    ]
  }
]
A simple solution is to group the input array by .id and apply the above
filter to each group like this:
group_by(.id) | map({id: .[0].id, data: map(del(.id))})
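To see why the grouping step matters, here is a rough Python analogue that applies the per-group step once to the whole array and once per group. It is a sketch under assumptions: sorted plus itertools.groupby stands in for jq's group_by, which also orders the groups by key, and the helper names are hypothetical:

```python
from itertools import groupby
from operator import itemgetter

rows = [
    {"id": "foo", "x1": 42},
    {"id": "foo", "y1": 123},
    {"id": "bar", "z1": 999},
]


def strip_id(row):
    # Analogue of del(.id)
    return {key: value for key, value in row.items() if key != "id"}


def nest(group):
    # Analogue of {id: .[0].id, data: map(del(.id))}
    return {"id": group[0]["id"], "data": [strip_id(row) for row in group]}


# Without grouping, the "bar" id is silently replaced by the first id.
naive = nest(rows)

# Sort by id, split into runs of equal ids, then nest each run.
by_id = sorted(rows, key=itemgetter("id"))
grouped = [nest(list(run)) for _, run in groupby(by_id, key=itemgetter("id"))]

print(naive)
# {'id': 'foo', 'data': [{'x1': 42}, {'y1': 123}, {'z1': 999}]}
print(grouped)
# [{'id': 'bar', 'data': [{'z1': 999}]}, {'id': 'foo', 'data': [{'x1': 42}, {'y1': 123}]}]
```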
There is a very similar filter in the last post, which could be optimized into
a single reduce filter. That’s not the case here! The difference is that the
reduce filter produces an object whose keys themselves contain the data, and
reduce also uses these keys for grouping the data. But the output format in
this case requires additional postprocessing anyway, so I think the
group_by|map solution is the best solution here.
Note that I present this solution only for completeness’ sake because I see no
practical use for this output format. Either I’d use the output of
pullout_groups_by, where I can access the data directly with a key (in that
case the additional data container is just noise), or I’d use the output of
group_by(.id) directly.
Summary
Using the + expression avoids the clumsy “assign and extract value” idiom.
Controlling the order of the attributes is a nice extra on top of that.