id: version-0.66.0-org.apache.streampipes.processors.transformation.flink.processor.boilerplate
title: Boilerplate Removal
sidebar_label: Boilerplate Removal
original_id: org.apache.streampipes.processors.transformation.flink.processor.boilerplate
Description
Removes boilerplate markup from an HTML document and extracts the full text.
Required input
Requires a text field containing the HTML to process.
Configuration
Select the extractor type and the output mode.
Output
Appends a new text field containing the content of the HTML page without the boilerplate.
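
The input/output behaviour described above can be illustrated with a minimal sketch. This is not the processor's actual implementation: it is a simplified stand-in built on Python's standard `html.parser`, and the set of tags treated as boilerplate, the function name `remove_boilerplate`, and the output field name `fulltext` are all assumptions made for illustration.

```python
from html.parser import HTMLParser

# Tags whose content this sketch treats as boilerplate (an assumption,
# not the rule set used by the real processor).
BOILERPLATE_TAGS = {"script", "style", "nav", "header", "footer", "aside"}


class _TextExtractor(HTMLParser):
    """Collects text that is not nested inside a boilerplate tag."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a boilerplate element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in BOILERPLATE_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in BOILERPLATE_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        stripped = data.strip()
        if self._skip_depth == 0 and stripped:
            self._chunks.append(stripped)

    def text(self):
        return " ".join(self._chunks)


def remove_boilerplate(event, html_field, output_field="fulltext"):
    """Return a copy of the event with the extracted text appended.

    `output_field` is a hypothetical name; the original event is not modified.
    """
    parser = _TextExtractor()
    parser.feed(event[html_field])
    parser.close()
    enriched = dict(event)
    enriched[output_field] = parser.text()
    return enriched


event = {
    "id": 1,
    "html": (
        "<html><head><style>p{}</style></head><body>"
        "<nav>Home | About</nav><p>Hello <b>world</b></p>"
        "<footer>Copyright</footer></body></html>"
    ),
}
result = remove_boilerplate(event, "html")
```

The sketch keeps the original HTML field and adds the extracted text as a new field, mirroring the "appends a new text field" behaviour. A tag-based filter like this is far cruder than a dedicated boilerplate-detection library, so it should be read only as an illustration of the data flow.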