SERP Tips: Difference between revisions

From CommunityData
Revision as of 03:50, 16 October 2020

Using the JSON data

The .7z files from SERP include .json files.

They're not "pretty printed": instead of being nicely formatted, each file is collapsed into one long string. Fortunately, there are tools that pretty print .json. Copy the .json text into the box at https://jsonformatter.org/json-pretty-print and click the pretty print button to view the file laid out so that its nesting structure is visible.
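
The jq command-line tool can also pretty print a file locally, which avoids pasting data into a website. A minimal sketch, assuming jq is installed; the sample file and its contents are made up for illustration and are much smaller than a real SERP file:

```shell
# Hypothetical sample file, collapsed onto a single line like the SERP .json files.
sample=$(mktemp)
printf '%s' '{"linkElements":[{"href":"https://example.com/a"},{"href":"https://example.com/b"}]}' > "$sample"

# The identity filter '.' leaves the data unchanged; jq indents its output by default.
pretty=$(jq '.' "$sample")
printf '%s\n' "$pretty"
```

For a large file, piping the output through a pager such as less makes it easier to browse.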

You can use the jq tool to quickly navigate the .json and extract just what you care about. For example, if you are a command-line user and want only the URLs, you can run:

 cat 'Sat Mar 28 2020 19-12-13 GMT-0500 (Central Daylight Time).json' | jq '.linkElements | .[] | .href'


This sends the text of the .json file into jq, navigates the tree to the 'linkElements' list of links, iterates over each item in that list, and selects only the 'href' field (i.e. the URL) of each link.
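
The steps can be tried on a small file before running them on real data. This is a sketch, assuming jq is installed: the sample file is made up, and its 'text' field is hypothetical; only the 'linkElements' list and the 'href' field come from the command above. It also passes the file name to jq directly instead of using cat, and adds the -r flag, which prints raw strings without the surrounding JSON quotes.

```shell
# Hypothetical sample file; only 'linkElements' and 'href' are taken from the SERP example.
sample=$(mktemp)
printf '%s' '{"linkElements":[{"href":"https://example.com/a","text":"A"},{"href":"https://example.com/b","text":"B"}]}' > "$sample"

# Same filter as the command above: each URL is printed as a quoted JSON string.
quoted=$(jq '.linkElements | .[] | .href' "$sample")

# Shorter equivalent filter, with -r to print the bare URLs, one per line.
urls=$(jq -r '.linkElements[].href' "$sample")
printf '%s\n' "$urls"
```

The bare form is usually the more convenient one when the list is passed on to other tools or saved to a text file.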