From 549ba9c42f290d73f2ec934c2f4afd62e6ec3257 Mon Sep 17 00:00:00 2001
From: J Wong
Date: Fri, 7 Feb 2025 01:21:00 -0500
Subject: [PATCH] htmlq: add page (#15625)

Co-authored-by: Wiktor Perskawiec
Co-authored-by: Juri Dispan
Co-authored-by: Managor <42655600+Managor@users.noreply.github.com>
---
 pages/common/htmlq.md | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)
 create mode 100644 pages/common/htmlq.md

diff --git a/pages/common/htmlq.md b/pages/common/htmlq.md
new file mode 100644
index 0000000000..ca56e69791
--- /dev/null
+++ b/pages/common/htmlq.md
@@ -0,0 +1,24 @@
+# htmlq
+
+> Use CSS selectors to extract content from HTML files.
+> More information: .
+
+- Return all elements of class `card`:
+
+`cat {{path/to/file.html}} | htmlq '.card'`
+
+- Get the text content of the first paragraph:
+
+`cat {{path/to/file.html}} | htmlq --text 'p:first-of-type'`
+
+- Find all the links in a page:
+
+`cat {{path/to/file.html}} | htmlq --attribute href 'a'`
+
+- Remove all images and SVGs from a page:
+
+`cat {{path/to/file.html}} | htmlq --remove-nodes 'img' --remove-nodes 'svg'`
+
+- Pretty print and write the output to a file:
+
+`htmlq --pretty --filename {{path/to/input.html}} --output {{path/to/output.html}}`