website/versioned_docs/version-0.65.0-pre-asf/org.apache.streampipes.processors.textmining.jvm.chunker/documentation.md

id: version-0.65.0-pre-asf-org.apache.streampipes.processors.textmining.jvm.chunker
title: Chunker (English)
sidebar_label: Chunker (English)
original_id: org.apache.streampipes.processors.textmining.jvm.chunker

Segments the given tokens into chunks (e.g. noun groups, verb groups) and appends the found chunks, together with their chunk types, to the stream.

Needs a stream with two string list properties:

A list of tokens
A list of part-of-speech tags (the Part-of-Speech processing element can be used for that)

Assign the token list and the part-of-speech tag list to the corresponding stream properties.

Example:

Input:

tokens: ["John", "is", "a", "Person"]
tags: ["NNP", "VBZ", "DT", "NN"]

Output:

tokens: ["John", "is", "a", "Person"]
tags: ["NNP", "VBZ", "DT", "NN"]
chunks: ["John", "is", "a Person"]
chunkType: ["NP", "VP", "NP"]
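
The grouping shown in the example can be pictured with a small sketch. This is not the processor's actual code: it assumes that the underlying chunker labels each token with a BIO-style tag (`B-NP` starts a noun phrase, `I-NP` continues it, `O` marks a token outside any chunk), which is common for chunkers but is not stated in this documentation. The function name `group_chunks` and the BIO tags used below are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def group_chunks(tokens: List[str], bio_tags: List[str]) -> Tuple[List[str], List[str]]:
    """Merge tokens into chunks based on BIO-style chunk tags (assumed format)."""
    if len(tokens) != len(bio_tags):
        raise ValueError("tokens and bio_tags must have the same length")

    groups = []          # each entry: [chunk_type, [tokens of that chunk]]
    open_chunk = False   # True while the last group may still be extended

    for token, tag in zip(tokens, bio_tags):
        # "B-NP" -> ("B", "-", "NP"); "O" -> ("O", "", "")
        prefix, _, chunk_type = tag.partition("-")
        if prefix == "O" or not chunk_type:
            open_chunk = False  # token lies outside any chunk, so skip it
            continue
        if prefix == "I" and open_chunk and groups[-1][0] == chunk_type:
            groups[-1][1].append(token)  # continue the current chunk
        else:
            groups.append([chunk_type, [token]])  # start a new chunk
            open_chunk = True

    chunks = [" ".join(words) for _, words in groups]
    chunk_types = [chunk_type for chunk_type, _ in groups]
    return chunks, chunk_types


# Hypothetical BIO tags for the example sentence above.
tokens = ["John", "is", "a", "Person"]
bio_tags = ["B-NP", "B-VP", "B-NP", "I-NP"]
chunks, chunk_types = group_chunks(tokens, bio_tags)
print(chunks)       # ['John', 'is', 'a Person']
print(chunk_types)  # ['NP', 'VP', 'NP']
```

Under these assumptions, `chunks` and `chunkType` stay aligned by index, so the n-th chunk type always describes the n-th chunk, as in the output above.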