diff --git a/pages/common/htmlq.md b/pages/common/htmlq.md new file mode 100644 index 0000000000..ca56e69791 --- /dev/null +++ b/pages/common/htmlq.md @@ -0,0 +1,24 @@ +# htmlq + +> Use CSS selectors to extract content from HTML files. +> More information: . + +- Return all elements of class `card`: + +`cat {{path/to/file.html}} | htmlq '.card'` + +- Get the text content of the first paragraph: + +`cat {{path/to/file.html}} | htmlq --text 'p:first-of-type'` + +- Find all the links in a page: + +`cat {{path/to/file.html}} | htmlq --attribute href 'a'` + +- Remove all images and SVGs from a page: + +`cat {{path/to/file.html}} | htmlq --remove-nodes 'img' --remove-nodes 'svg'` + +- Pretty print and write the output to a file: + +`htmlq --pretty --filename {{path/to/input.html}} --output {{path/to/output.html}}`