# pup

> Command-line HTML parsing tool.

- Transform a raw HTML file into a cleaned, indented, and colored format:

`cat {{index.html}} | pup --color`

- Filter HTML by element tag name:

`cat {{index.html}} | pup '{{tag}}'`

- Filter HTML by id:

`cat {{index.html}} | pup '{{div#id}}'`

- Filter HTML by attribute value:

`cat {{index.html}} | pup '{{input[type="text"]}}'`

- Print all text from the filtered HTML elements and their children:

`cat {{index.html}} | pup '{{div}} text{}'`

- Print HTML as JSON:

`cat {{index.html}} | pup '{{div}} json{}'`