These scripts come with the stag and dbstag distributions.
stag-autoschema.pl -w sxpr sample-data.xml
Takes a stag-compatible file (xml, sxpr, itext), or a file in any format plus a parser, and writes out the implicit underlying stag-schema.

A stag-schema should be relatively self-explanatory. Here is an example stag-schema, shown in sxpr syntax:

  (db
   (person*
    (name "s")
    (address+
     (address_type "s")
     (street "s")
     (street2? "s")
     (city "s")
     (zip? "s"))))

The database db contains zero or more persons; each person has a mandatory name and at least one address.

The cardinality mnemonics are as follows:

  +  one or more
  ?  zero or one
  *  zero or more
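To make the cardinality suffixes concrete ('*' for zero or more, '+' for at least one, '?' for optional, no suffix for exactly one), here is a minimal Python sketch using only the standard library. It is not the algorithm used by stag-autoschema.pl; the function name `infer_cardinality` and the sample data are hypothetical, and the result reflects only the data that was actually seen.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def infer_cardinality(root, parent_tag):
    """Return {child_tag: suffix} for the children of every <parent_tag> element.

    Suffix is '' (exactly one), '?' (zero or one), '+' (one or more)
    or '*' (zero or more), judged only from the observed data.
    """
    # One Counter of child tags per parent element.
    counts = [Counter(child.tag for child in p) for p in root.iter(parent_tag)]
    # Collect child tags in first-seen order.
    tags = []
    for c in counts:
        for t in c:
            if t not in tags:
                tags.append(t)
    result = {}
    for tag in tags:
        lo = min(c[tag] for c in counts)   # a Counter returns 0 for absent tags
        hi = max(c[tag] for c in counts)
        if hi <= 1:
            result[tag] = "" if lo == 1 else "?"
        else:
            result[tag] = "+" if lo >= 1 else "*"
    return result

# Hypothetical sample data.
SAMPLE = """
<db>
  <person>
    <name>fred</name>
    <address><street>1 Main St</street><city>Springfield</city></address>
    <address><street>2 Side St</street><city>Shelbyville</city><zip>12345</zip></address>
  </person>
  <person>
    <name>jim</name>
    <address><street>3 High St</street><city>Ogdenville</city></address>
  </person>
</db>
"""

root = ET.fromstring(SAMPLE)
person_schema = infer_cardinality(root, "person")
address_schema = infer_cardinality(root, "address")
print(person_schema)
print(address_schema)
```

In this sample every person has exactly one name and one or more addresses, and zip appears in only some addresses, so it is reported as optional.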
stag-db.pl -r person -k social_security_no -i ./person-idx myrecords.xml
stag-db.pl -i ./person-idx -q 999-9999-9999 -q 888-8888-8888
Builds a simple file-based database for persistent storage and retrieval of nodes from a stag-compatible document.

Imagine you have a very large file of data in a stag-compatible format such as XML. You want to index all the elements of type B<person>; each person can be uniquely identified by B<social_security_no>, which is a direct subnode of B<person>.

The first thing to do is to build an index file, which will be stored in your current directory:

  stag-db.pl -r person -k social_security_no -i ./person-idx myrecords.xml

You can then use the index "person-idx" to retrieve B<person> nodes by their social security number:

  stag-db.pl -i ./person-idx -q 999-9999-9999 > some-person.xml

You can export using different stag formats:

  stag-db.pl -i ./person-idx -q 999-9999-9999 -w sxpr > some-person.xml

You can retrieve multiple nodes (although these need to be rooted to make a valid file):

  stag-db.pl -i ./person-idx -q 999-9999-9999 -q 888-8888-8888 -top personset

Or you can use a list of IDs from a file (newline-delimited):

  stag-db.pl -i ./person-idx -qf my_ss_nmbrs.txt -top personset
stag-diff.pl -ignore foo-id -ignore bar-id file1.xml file2.xml
Compares two data trees and reports whether they match. If they do not match, the mismatch is reported.
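As a rough illustration of comparing two trees while ignoring selected elements, here is a small Python sketch using only the standard library. It is not how stag-diff.pl is implemented: the function `first_mismatch` is hypothetical, and it assumes that children are compared in document order, which may differ from the real tool's behaviour.

```python
import xml.etree.ElementTree as ET

def first_mismatch(a, b, ignore=(), path=""):
    """Return a description of the first difference, or None if the trees match."""
    if a.tag != b.tag:
        return f"{path or '/'}: tag {a.tag!r} != {b.tag!r}"
    here = f"{path}/{a.tag}"
    # Drop ignored elements before comparing children.
    kids_a = [c for c in a if c.tag not in ignore]
    kids_b = [c for c in b if c.tag not in ignore]
    if not kids_a and not kids_b:
        # Leaf nodes: compare text content, ignoring surrounding whitespace.
        text_a, text_b = (a.text or "").strip(), (b.text or "").strip()
        return None if text_a == text_b else f"{here}: {text_a!r} != {text_b!r}"
    if len(kids_a) != len(kids_b):
        return f"{here}: {len(kids_a)} children != {len(kids_b)}"
    for child_a, child_b in zip(kids_a, kids_b):
        found = first_mismatch(child_a, child_b, ignore, here)
        if found:
            return found
    return None

# Hypothetical sample trees.
t1 = ET.fromstring("<person><foo-id>1</foo-id><name>fred</name></person>")
t2 = ET.fromstring("<person><foo-id>2</foo-id><name>fred</name></person>")
t3 = ET.fromstring("<person><foo-id>1</foo-id><name>jim</name></person>")

same = first_mismatch(t1, t2, ignore=("foo-id",))   # differs only in an ignored element
diff = first_mismatch(t1, t3, ignore=("foo-id",))
print(same)
print(diff)
```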
stag-drawtree.pl -o my.png myfile.xml
stag-drawtree.pl -p My::MyFormatParser -o my.png myfile.myfmt
Requires the GD library and the GD Perl module.
stag-eval.pl '' file2.xml
stag-filter.pl person -q name=fred file1.xml
stag-filter.pl person 'sub {shift->get_name =~ /^A/}' file1.xml
stag-filter.pl -p My::Foo -w sxpr record 'sub{..}' file2
Parses an input file using the specified parser (which may be a built-in stag parser, such as xml) and filters the resulting stag tree according to a user-supplied subroutine, writing out only the nodes/elements that pass the test.

The parser is event-based, so it should be able to handle large files (although if the node you parse is large, it will take up more memory).
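The event-based filtering idea can be sketched in Python with the standard library's incremental XML parser. This only illustrates the technique and is not the stag implementation; `filter_nodes` and the sample data are hypothetical.

```python
import io
import xml.etree.ElementTree as ET

def filter_nodes(source, tag, test):
    """Yield the serialized form of each <tag> element for which test(element) is true.

    Elements are examined one at a time as their end tag is parsed, and
    their content is released afterwards, so a large file need not be
    held in memory as one fully populated tree.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            if test(elem):
                yield ET.tostring(elem, encoding="unicode").strip()
            elem.clear()  # drop the children and text of the processed record

# Hypothetical sample data.
DATA = (b"<set>"
        b"<person><name>fred</name></person>"
        b"<person><name>jim</name></person>"
        b"</set>")

matches = list(filter_nodes(io.BytesIO(DATA), "person",
                            lambda p: p.findtext("name") == "fred"))
print(matches)
```

As in the description above, memory use grows with the size of the individual records being tested rather than with the size of the whole file.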
stag-findsubtree.pl 'person/name' file.xml
Parses an input file and writes out the subnodes matching the specified path.
stag-flatten.pl MyFile.xml dept/name dept/person/name
Reads in a file in a stag format and 'flattens' it to a tab-delimited table format. Given this data:

  (company
   (dept
    (name "special-operations")
    (person
     (name "james-bond"))
    (person
     (name "fred"))))

the above command will return a two-column table:

  special-operations    james-bond
  special-operations    fred
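The flattening step can be pictured with a short Python sketch over the same data, written as XML. It assumes one outer column and one nested column only; the function `flatten` is hypothetical and is not part of stag.

```python
import xml.etree.ElementTree as ET

def flatten(root, record_tag, outer_path, inner_path):
    """Return one row per inner value, repeating the outer value alongside it."""
    rows = []
    for record in root.iter(record_tag):
        outer = record.findtext(outer_path)
        for inner in record.findall(inner_path):
            rows.append((outer, inner.text))
    return rows

XML = """
<company>
  <dept>
    <name>special-operations</name>
    <person><name>james-bond</name></person>
    <person><name>fred</name></person>
  </dept>
</company>
"""

root = ET.fromstring(XML)
rows = flatten(root, "dept", "name", "person/name")
for row in rows:
    print("\t".join(row))
```

Each person contributes one row, and the department name is repeated on every row, which is the denormalization a flat table requires.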
stag-grep.pl person -q name=fred file1.xml
stag-grep.pl person 'sub {shift->get_name =~ /^A/}' file1.xml
stag-grep.pl -p My::Foo -w sxpr record 'sub{..}' file2
Parses an input file using the specified parser (which may be a built-in stag parser, such as xml) and filters the resulting stag tree according to a user-supplied subroutine, writing out only the nodes/elements that pass the test.

The parser is event-based, so it should be able to handle large files (although if the node you parse is large, it will take up more memory).
stag-handle.pl -w itext -c my-handler.pl myfile.xml > processed.itext
stag-handle.pl -w itext -p My::Parser -m My::Handler myfile.xml > processed.itext
Takes a file in a stag-compatible format (xml, sxpr or itext) and turns the data into an event stream, passing it through the handler (my-handler.pl in the first example above).
stag-join.pl -w xml country/city_id=capital/capital_id countries.xml capitals.xml
stag-join.pl -w itext gene/tax_id=species/tax_id genedb.itext speciesdb.itext
Performs a relational-style INNER JOIN between two stag trees; this effectively merges two files together, based on an ID that appears in both files.
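A minimal Python sketch of an inner join between two trees follows. It assumes that a joined record is the left-hand node with each matching right-hand node appended as a child; the actual layout produced by stag-join.pl may differ. The function `inner_join` and the sample data are hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

def inner_join(left_root, left_tag, left_key, right_root, right_tag, right_key):
    """Return copies of the left records that have at least one matching right
    record, with copies of the matching right records appended as children."""
    # Index the right-hand records by their key value.
    index = {}
    for r in right_root.iter(right_tag):
        key = r.findtext(right_key)
        if key is not None:
            index.setdefault(key, []).append(r)
    joined = []
    for l in left_root.iter(left_tag):
        key = l.findtext(left_key)
        matches = index.get(key, []) if key is not None else []
        if matches:  # inner join: unmatched left records are dropped
            merged = copy.deepcopy(l)
            merged.extend([copy.deepcopy(m) for m in matches])
            joined.append(merged)
    return joined

# Hypothetical sample data.
countries = ET.fromstring(
    "<countries>"
    "<country><name>France</name><city_id>1</city_id></country>"
    "<country><name>Atlantis</name><city_id>9</city_id></country>"
    "</countries>")
capitals = ET.fromstring(
    "<capitals>"
    "<capital><capital_id>1</capital_id><name>Paris</name></capital>"
    "</capitals>")

joined = inner_join(countries, "country", "city_id", capitals, "capital", "capital_id")
for node in joined:
    print(node.findtext("name"), "->", node.findtext("capital/name"))
```

Because this is an inner join, the country with no matching capital is left out of the result.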
stag-merge.pl -xml file1.xml file2.xml
Script wrapper for the Data::Stag modules.
stag-mogrify.pl -w itext file1.xml file2.xml
Script wrapper for the Data::Stag modules. Feeds files into a parser object that generates nestarray events, and feeds the events into a handler/writer class.
# convert XML to IText
stag-parse.pl -p xml -w itext file1.xml file2.xml
# use a custom parser/generator and a custom writer/generator
stag-parse.pl -p MyMod::MyParser -w MyMod::MyWriter file.txt
Script wrapper for the Data::Stag modules. Feeds files into a parser object that generates nestarray events, and feeds the events into a handler/writer class.
stag-query.pl avg person/age file.xml
stag-query.pl sum person/salary file.xml
stag-query.pl 'sub { $agg .= ", ".shift }' person/name file.xml
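Performs aggregate queries over the values found at a given path, using either a named function such as avg or sum, or a user-supplied subroutine.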
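What such an aggregate query computes can be sketched in Python as follows. The `query` function, the `AGGREGATES` table and the sample data are hypothetical; only the avg and sum cases from the synopsis are covered, and values are assumed to be numeric.

```python
import xml.etree.ElementTree as ET

# Named aggregate functions; each takes a non-empty list of numbers.
AGGREGATES = {
    "sum": sum,
    "avg": lambda values: sum(values) / len(values),
}

def query(root, func_name, path):
    """Apply the named aggregate to the numeric values found at path."""
    values = [float(e.text) for e in root.findall(f".//{path}")]
    if not values:
        raise ValueError(f"no values found at {path!r}")
    return AGGREGATES[func_name](values)

# Hypothetical sample data.
XML = ("<set>"
       "<person><age>30</age><salary>1000</salary></person>"
       "<person><age>40</age><salary>2500</salary></person>"
       "</set>")

root = ET.fromstring(XML)
average_age = query(root, "avg", "person/age")
total_salary = query(root, "sum", "person/salary")
print(average_age)
print(total_salary)
```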
stag-splitter.pl -split person -name social_security_no file.xml
Splits a file, using a user-specified parser (default xml), around a specified split node, naming each output file according to the name argument.

The files will be named anonymously unless the '-name' switch is specified; this will use the value of the specified element as the filename. For example, if we have

  <top>
    <a>
      <b>foo</b>
      <c>yah</c>
      <d>
        <e>xxx</e>
      </d>
    </a>
    <a>
      <b>bar</b>
      <d>
        <e>wibble</e>
      </d>
    </a>
  </top>

and we run

  stag-splitter.pl -split a -name b

it will generate two files, "foo.xml" and "bar.xml".

The input format can be 'xml', 'sxpr' or 'itext'; if this is left blank, the format will be guessed from the file suffix. The output format defaults to the same as the input format, but another can be chosen.

Files go in the current directory, but this can be overridden with the '-dir' switch.
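The splitting behaviour can be sketched in Python as follows, writing into a temporary directory. This is an illustration rather than the stag implementation: `split_file` is hypothetical, and so is the fallback naming scheme used when no name element is given, since the tool's anonymous naming is not specified here.

```python
import tempfile
from pathlib import Path
import xml.etree.ElementTree as ET

def split_file(root, split_tag, name_tag, out_dir):
    """Write each <split_tag> element to its own file and return the paths."""
    out_dir = Path(out_dir)
    written = []
    for i, node in enumerate(root.iter(split_tag), start=1):
        name = node.findtext(name_tag) if name_tag else None
        # Hypothetical fallback: tag name plus a counter.
        stem = name if name else f"{split_tag}-{i}"
        path = out_dir / f"{stem}.xml"
        path.write_text(ET.tostring(node, encoding="unicode"))
        written.append(path)
    return written

XML = """
<top>
  <a><b>foo</b><c>yah</c><d><e>xxx</e></d></a>
  <a><b>bar</b><d><e>wibble</e></d></a>
</top>
"""

root = ET.fromstring(XML)
with tempfile.TemporaryDirectory() as tmp:
    names = sorted(p.name for p in split_file(root, "a", "b", tmp))
    anonymous = sorted(p.name for p in split_file(root, "a", None, tmp))
print(names)
print(anonymous)
```

A real tool would also need to guard against name values that are not safe as filenames; the sketch leaves that out.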
stag-view.pl file1.xml
Draws a Tk tree with expandable/convertible nodes.
selectall_html.pl -d "dbi:Pg:dbname=mydb;host=localhost" "SELECT * FROM a NATURAL JOIN b"
selectall_xml.pl [-d <dbi>] [-f file of sql] [-nesting|n <nesting>] SQL
This script queries a database using either SQL provided by the user or an SQL template; the query results are turned into XML using the L<DBIx::DBStag> module. The DBStag SQL extension "USE NESTING..." controls the nesting of the XML.
stag-autoddl.pl -parser XMLAutoddl -handler ITextWriter file1.txt file2.txt
stag-autoddl.pl -parser MyMod::MyParser -handler MyMod::MyWriter file.txt
Script wrapper for the Data::Stag modules.
# convert XML to IText
stag-bulkload.pl -l person file1.xml file2.xml
# use a custom parser/generator and a custom writer/generator
stag-bulkload.pl -p MyMod::MyParser file.txt
Creates bulkload SQL statements for an input file.

Works only with certain kinds of schemas, where the FK relations make a tree (not a graph); i.e. the only FKs are to the parent.

You do not need a connection to the DB.

It is of no use for incremental loading: it assumes integer surrogate primary keys and starts these from 1.
stag-ir.pl -r person -k social_security_no -d Pg:mydb myrecords.xml
stag-ir.pl -d Pg:mydb -q 999-9999-9999 -q 888-8888-8888
Indexes stag nodes (XML elements) in a simple relational db structure, keyed by ID with an XML blob as the value.

Imagine you have a very large file of data in a stag-compatible format such as XML. You want to index all the elements of type B<person>; each person can be uniquely identified by B<social_security_no>, which is a direct subnode of B<person>.

The first thing to do is to build the index, which will be stored in the database mydb:

  stag-ir.pl -r person -k social_security_no -d Pg:mydb myrecords.xml

You can then use the index to retrieve B<person> nodes by their social security number:

  stag-ir.pl -d Pg:mydb -q 999-9999-9999 > some-person.xml

You can export using different stag formats:

  stag-ir.pl -d Pg:mydb -q 999-9999-9999 -w sxpr > some-person.xml

You can retrieve multiple nodes (although these need to be rooted to make a valid file):

  stag-ir.pl -d Pg:mydb -q 999-9999-9999 -q 888-8888-8888 -top personset

Or you can use a list of IDs from a file (newline-delimited):

  stag-ir.pl -d Pg:mydb -qf my_ss_nmbrs.txt -top personset
stag-pgslurp.pl -d "dbi:Pg:dbname=mydb;host=localhost" myfile.xml
This script is for storing data (specified in a nested file format such as XML or S-Expressions) in a database. It assumes a database schema corresponding to the tags in the input data already exists.
stag-storenode.pl -d "dbi:Pg:dbname=mydb;host=localhost" myfile.xml
This script is for storing data (specified in a nested file format such as XML or S-Expressions) in a database. It assumes a database schema corresponding to the tags in the input data already exists.