This is an extension to Scufl2 API which provides the export capability to wfdesc ontology from the Wf4Ever [2] RO wfdesc ontology.
You can use the tavlang convert command line tool for converting Taverna workflows to wfdesc.
The output is a RDF Turtle document containing statements about the workflow structure according to the RO ontology wfdesc ontology, with additional service descriptions using the wf4ever ontology where supported.
: stain@ralph ~/src/wf4ever/scufl2-wfdesc; cat src/test/resources/helloworld.wfdesc.ttl
@base <http://ns.taverna.org.uk/2010/workflowBundle/8781d5f4-d0ba-48a8-a1d1-14281bd8a917/workflow/Hello_World/> . @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> . @prefix xsd: <http://www.w3.org/2001/XMLSchema#> . @prefix owl: <http://www.w3.org/2002/07/owl#> . @prefix prov: <http://www.w3.org/ns/prov#> . @prefix wfdesc: <http://purl.org/wf4ever/wfdesc#> . @prefix wf4ever: <http://purl.org/wf4ever/wf4ever#> . @prefix roterms: <http://purl.org/wf4ever/roterms#> . @prefix dc: <http://purl.org/dc/elements/1.1/> . @prefix dcterms: <http://purl.org/dc/terms/> . @prefix comp: <http://purl.org/DP/components#> . @prefix dep: <http://scape.keep.pt/vocab/dependencies#> . @prefix biocat: <http://biocatalogue.org/attribute/> . @prefix : <#> . <datalink?from=processor/hello/out/value&to=out/greeting> a wfdesc:DataLink ; wfdesc:hasSource <processor/hello/out/value> ; wfdesc:hasSink <out/greeting> . <> a wfdesc:Workflow , wfdesc:Description , wfdesc:Process ; dc:creator "Stian Soiland-Reyes" ; dcterms:description "One of the simplest workflows possible. No workflow input ports, a single workflow output port \"greeting\", outputting \"Hello, world!\" as produced by the String Constant \"hello\"." ; dcterms:title "Hello World" ; rdfs:label "Hello_World" ; wfdesc:hasOutput <out/greeting> ; wfdesc:hasSubProcess <processor/hello/> ; wfdesc:hasDataLink <datalink?from=processor/hello/out/value&to=out/greeting> . <out/greeting> a wfdesc:Output , wfdesc:Description , wfdesc:Input ; rdfs:label "greeting" . <processor/hello/> a wfdesc:Process , wfdesc:Description ; rdfs:label "hello" ; wfdesc:hasOutput <processor/hello/out/value> . <processor/hello/out/value> a wfdesc:Output , wfdesc:Description ; rdfs:label "value" .
Annotations in the workflow are also extracted for the workflow, processors and input ports (see namespaces above):
Richer semantic annotations (e.g. on Taverna Components) are extracted verbatim, e.g.:
<> a wfdesc:Workflow , wfdesc:Description , wfdesc:Process ;
comp:fits comp:MigrationAction ;
comp:migrates _:node18musbm56x1 .
_:node18musbm56x1 a comp:MigrationPath ;
comp:fromMimetype "image/tiff" ;
comp:toMimetype "image/tiff" .
See valid_component_imagemagickconvert.wfdesc.ttl
for the complete example.
Example:
import org.apache.taverna.scufl2.api.container.WorkflowBundle; import org.apache.taverna.scufl2.api.io.ReaderException; import org.apache.taverna.scufl2.api.io.WorkflowBundleIO; import org.apache.taverna.scufl2.api.io.WriterException; .. File original = new File("helloworld.t2flow"); File output = new File("helloworld.wfdesc.ttl"); String original = null; // to guess filetype WorkflowBundle wfBundle = io.readBundle(original, null); io.writeBundle(wfBundle, output, "text/vnd.wf4ever.wfdesc+turtle");
These queries can be executed using SPARQL on the returned wfdesc Turtle, and assume these PREFIXes:
PREFIX prov: <http://www.w3.org/ns/prov#> PREFIX wfdesc: <http://purl.org/wf4ever/wfdesc#> PREFIX wfprov: <http://purl.org/wf4ever/wfprov#> PREFIX tavernaprov: <http://ns.taverna.org.uk/2012/tavernaprov/> PREFIX cnt: <http://www.w3.org/2011/content#> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> PREFIX wf4ever: <http://purl.org/wf4ever/wf4ever#> PREFIX comp: <http://purl.org/DP/components#>
The wfdesc model does not have a concept of “top-level” workflow, however you can use a SPARQL NOT EXISTS
filter to skip any nested workflows:
SELECT ?wf WHERE { ?wf a wfdesc:Workflow . FILTER NOT EXISTS { ?processInParent prov:specializationOf ?wf . ?parent wfdesc:hasSubProcess ?processInParent } . }
Thus any workflow ?wf
which (specialization) has been used as a sub-process in ?parent
are excluded from the result. As SCUFL2 workflows have unique URIs within each workflow bundle, e.g. http://ns.taverna.org.uk/2010/workflowBundle/01348671-5aaa-4cc2-84cc-477329b70b0d/workflow/Hello_Anyone/ - the above would not be excluding a workflow just because its structure has been reused elsewhere.
A variant of the above is if you are only interested in processors in top-level workflows:
SELECT ?proc WHERE { ?wf a wfdesc:Workflow ; wfdesc:hasSubProcess ?proc . FILTER NOT EXISTS { ?processInParent prov:specializationOf ?wf . ?parent wfdesc:hasSubProcess ?processInParent } . }