From $OBJ
See also: Web/Data Scraping & Automation Best Practices
Best Practices:
- Always check for $ERR when accessing attributes or elements that may not exist.
- Use functional chaining (.map(), .reduce(), .filter()) for processing extraction results.
- Prefer .findall() for complex queries and nested extraction patterns.
data = json(rawdata)
data = xml(rawxml)
data = html(rawhtml)
data = findall(term)
Searches for all occurrences of a match against JSON, XML, or HTML data.
Each item in search is optional.
The has, nhas, and, nand, or, and nor items are all recursive: each takes a list of nested search terms.
The leading n in nhas, nand, and nor indicates negation.
search = {
name:"",
value:"",
attr:attrlist,
has:searchlist,
nhas:searchlist,
and:searchlist,
nand:searchlist,
or:searchlist,
nor:searchlist
}
attrlist = {attr, attr, etc}
attr = name: value
searchlist = [search, search, etc]
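As an illustration of how the grammar above composes, the following sketch builds a query from name, attr, and nhas. The file name page.html and the class value "header" are placeholders, and the reading of nhas as "contains no matching element" follows the description on this page rather than verified output.

```grapa
/* Hypothetical input file; any HTML document would do */
page = $file().get("page.html").str().html();

/* Intended match: <div class="header"> elements that contain no <img> */
query = {
    name: "div",
    attr: {class: "header"},
    nhas: [{name: "img"}]
};
matches = page.body.findall(query);
```

The nhas value is written as a list to follow the searchlist form of the grammar; the HTML example on this page passes a single search term to has instead.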
Advanced Usage: Extracting Data from HTML
/* Parse HTML and extract all <a> tags with an <img> child */
html = $file().get("page.html").str().html();
anchors_with_images = html.body.findall({name:"a", has:{name:"img"}});
/* Extract hrefs, filtering out any <a> without an href attribute */
hrefs = anchors_with_images.reduce(op(acc, a) {
    if (a.$LIST.href.type() != $ERR) {
        acc += a.$LIST.href;
    }
}, []);
Explanation:
- findall({name:"a", has:{name:"img"}}) finds all <a> tags that contain an <img> child.
- The .reduce() collects the href attributes, skipping any that are missing.
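The same extraction can also be sketched as a filter step followed by a map step, which separates the $ERR check from the projection. This assumes that .filter() and .map() each take a single-argument op whose last expression is the result, and that they accept the list returned by .findall(); treat it as a sketch rather than confirmed behavior.

```grapa
/* Same parse and query as the example above */
page = $file().get("page.html").str().html();
anchors_with_images = page.body.findall({name:"a", has:{name:"img"}});

/* Keep only anchors whose href attribute exists (assumed filter signature) */
with_href = anchors_with_images.filter(op(a) { a.$LIST.href.type() != $ERR; });

/* Project each remaining anchor to its href value (assumed map signature) */
hrefs = with_href.map(op(a) { a.$LIST.href; });
```

The .reduce() form does both steps in one pass; the two-step form may be easier to read when the check and the projection grow more complex.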
Query Syntax Recap
- name: Tag name to match (e.g., "a", "div")
- attr: Match attributes (e.g., {id:"main", class:"header"})
- has: Require child/descendant elements matching a query
- nhas: Require absence of child/descendant elements
- and, or, nand, nor: Combine multiple queries (logical operations)
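The logical operators in the recap can be sketched as follows. The ids "main" and "sidebar" are invented for illustration, and the example assumes that the sub-queries listed under or and nor are tested against the same element that name selects; that interpretation is not confirmed by this page.

```grapa
/* Hypothetical input file */
page = $file().get("page.html").str().html();

/* Intended match: <div> elements whose id is "main" or "sidebar" */
either = page.body.findall({
    name: "div",
    or: [{attr: {id: "main"}}, {attr: {id: "sidebar"}}]
});

/* Intended match: <a> elements containing neither an <img> nor a <span> */
plain_links = page.body.findall({
    name: "a",
    nor: [{has: {name: "img"}}, {has: {name: "span"}}]
});
```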
Best Practices
- Chain .findall() calls to traverse nested structures.
- Use functional methods (.map(), .reduce()) to process results.
- Always check for $ERR when accessing properties that may not exist.
Note on JSON and XML
The same .findall() principles apply to complex JSON and XML documents. You can use the same query patterns to extract nested data, filter by attributes/keys, and process results with functional methods.
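A minimal XML sketch, mirroring the HTML example: it assumes that .xml() is available as the method form of xml(rawxml), in the same way the HTML example uses .html(), and that .findall() can be called directly on the parsed document. The file name feed.xml and the element names item and link are hypothetical.

```grapa
/* Hypothetical feed file; element names are assumptions */
feed = $file().get("feed.xml").str().xml();

/* Intended match: <item> elements that contain a <link> */
items = feed.findall({name:"item", has:{name:"link"}});
```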
Note on .get() for $OBJ
There is no built-in .get() method for $OBJ in Grapa. If you want a .get() method, you must define it yourself as a member function in your class. Its behavior is entirely determined by your implementation.
Example:
/* You must define get() yourself if you want it */
Person = class {
    name = "";
    age = 0;
    get = op() { {"name": name, "age": age}; };
};
p = obj Person;
p.get();
If you do not define a .get() method, calling p.get() will result in an error. In this respect $OBJ differs from Python dicts and JavaScript Map objects, where .get() is built in.
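To show what the missing-method case might look like in practice, the sketch below calls .get() on a class that does not define it and checks the result with the same .type() comparison against $ERR used in the HTML example. The class Point and the variable names are hypothetical, and the assumption that the failed call yields a value that can be inspected this way (rather than aborting the script) is not confirmed by this page.

```grapa
/* Hypothetical class that does NOT define get() */
Point = class {
    x = 0;
    y = 0;
};
pt = obj Point;

/* No get() member exists, so this call is expected to produce an error */
result = pt.get();

/* Guard on the result before using it (assumes the error is returned as a value) */
if (result.type() == $ERR) {
    "Point has no get() method".echo();
};
```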
See Also:
- Language Reference
- Python-to-Grapa Migration Guide
- JS-to-Grapa Migration Guide
- Examples