| <?xml version="1.0" encoding="UTF-8"?> |
| <project version="4"> |
| <component name="StudySettings"> |
| <StudyTaskManager> |
| <option name="VERSION" value="14" /> |
| <option name="myUserTests"> |
| <map /> |
| </option> |
| <option name="course"> |
| <EduCourse> |
| <option name="authors"> |
| <list> |
| <StepikUserInfo> |
| <option name="firstName" value="Henry" /> |
| <option name="id" value="48485817" /> |
| <option name="lastName" value="Suryawirawan" /> |
| </StepikUserInfo> |
| </list> |
| </option> |
| <option name="compatible" value="true" /> |
| <option name="courseMode" value="Course Creator" /> |
| <option name="createDate" value="1557824500323" /> |
| <option name="customPresentableName" /> |
| <option name="description" value="This course provides a series of katas to get familiar with Apache Beam. Apache Beam website – https://beam.apache.org/" /> |
| <option name="environment" value="" /> |
| <option name="fromZip" value="false" /> |
| <option name="id" value="54532" /> |
| <option name="index" value="-1" /> |
| <option name="instructors"> |
| <list> |
| <option value="48485817" /> |
| </list> |
| </option> |
| <option name="language" value="Python 2.7" /> |
| <option name="languageCode" value="en" /> |
| <option name="name" value="Beam Katas - Python" /> |
| <option name="public" value="true" /> |
| <option name="sectionIds"> |
| <list /> |
| </option> |
| <option name="stepikChangeStatus" value="Up to date" /> |
| <option name="type" value="pycharm11 Python 2.7" /> |
| <option name="updateDate" value="1560937766000" /> |
| <option name="items"> |
| <list> |
| <Section> |
| <option name="courseId" value="54532" /> |
| <option name="customPresentableName" /> |
| <option name="id" value="85644" /> |
| <option name="index" value="1" /> |
| <option name="name" value="Introduction" /> |
| <option name="position" value="0" /> |
| <option name="stepikChangeStatus" value="Up to date" /> |
| <option name="updateDate" value="1559325495000" /> |
| <option name="items"> |
| <list> |
| <Lesson> |
| <option name="customPresentableName" /> |
| <option name="id" value="238426" /> |
| <option name="index" value="1" /> |
| <option name="name" value="Hello Beam" /> |
| <option name="stepikChangeStatus" value="Up to date" /> |
| <option name="updateDate" value="1560937886298" /> |
| <option name="unitId" value="210886" /> |
| <option name="items"> |
| <list> |
| <EduTask> |
| <option name="customPresentableName" /> |
| <option name="descriptionFormat" value="HTML" /> |
| <option name="descriptionText" value="<!-- ~ Licensed to the Apache Software Foundation (ASF) under one ~ or more contributor license agreements. See the NOTICE file ~ distributed with this work for additional information ~ regarding copyright ownership. The ASF licenses this file ~ to you under the Apache License, Version 2.0 (the ~ "License"); you may not use this file except in compliance ~ with the License. You may obtain a copy of the License at ~ ~ http://www.apache.org/licenses/LICENSE-2.0 ~ ~ Unless required by applicable law or agreed to in writing, software ~ distributed under the License is distributed on an "AS IS" BASIS, ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. ~ See the License for the specific language governing permissions and ~ limitations under the License. --> <html> <h2>Hello Beam Pipeline</h2> <p> Apache Beam is an open source, unified model for defining both batch and streaming data-parallel processing pipelines. Using one of the open source Beam SDKs, you build a program that defines the pipeline. The pipeline is then executed by one of Beam’s supported distributed processing back-ends, which include Apache Apex, Apache Flink, Apache Spark, and Google Cloud Dataflow. </p> <p> Beam is particularly useful for Embarrassingly Parallel data processing tasks, in which the problem can be decomposed into many smaller bundles of data that can be processed independently and in parallel. You can also use Beam for Extract, Transform, and Load (ETL) tasks and pure data integration. These tasks are useful for moving data between different storage media and data sources, transforming data into a more desirable format, or loading data onto a new system. </p> <p> To learn more about Apache Beam, refer to <a href="https://beam.apache.org/get-started/beam-overview/">Apache Beam Overview</a>. </p> <p> <b>Kata:</b> Your first kata is to create a simple pipeline that takes a hardcoded input element "Hello Beam". 
</p> <br> <div class="hint"> Hardcoded input can be created using <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.Create"> Create</a>. </div> <div class="hint"> Refer to the Beam Programming Guide <a href="https://beam.apache.org/documentation/programming-guide/#creating-pcollection-in-memory"> "Creating a PCollection from in-memory data"</a> section for more information. </div> </html> " /> |
| <option name="feedbackLink"> |
| <FeedbackLink> |
| <option name="link" /> |
| <option name="type" value="STEPIK" /> |
| </FeedbackLink> |
| </option> |
| <option name="id" value="755575" /> |
| <option name="index" value="1" /> |
| <option name="name" value="Hello Beam" /> |
| <option name="record" value="-1" /> |
| <option name="status" value="Unchecked" /> |
| <option name="stepikChangeStatus" value="Up to date" /> |
| <option name="files"> |
| <map> |
| <entry key="task.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list> |
| <AnswerPlaceholder> |
| <option name="hints"> |
| <list /> |
| </option> |
| <option name="index" value="0" /> |
| <option name="initialState" /> |
| <option name="initializedFromDependency" value="false" /> |
| <option name="length" value="6" /> |
| <option name="offset" value="903" /> |
| <option name="placeholderDependency" /> |
| <option name="placeholderText" value="TODO()" /> |
| <option name="possibleAnswer" value="beam.Create(['Hello Beam'])" /> |
| <option name="selected" value="false" /> |
| <option name="status" value="Unchecked" /> |
| <option name="studentAnswer" /> |
| <option name="useLength" value="false" /> |
| </AnswerPlaceholder> |
| </list> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="task.py" /> |
| <option name="text" value="# TODO: type solution here " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="true" /> |
| </TaskFile> |
| </value> |
| </entry> |
| <entry key="tests.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list /> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="tests.py" /> |
| <option name="text" value="from test_helper import run_common_tests, failed, passed, get_answer_placeholders def test_answer_placeholders(): placeholders = get_answer_placeholders() placeholder = placeholders[0] if placeholder == "": # TODO: your condition here passed() else: failed() if __name__ == '__main__': run_common_tests() # test_answer_placeholders() # TODO: uncomment test call " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="false" /> |
| </TaskFile> |
| </value> |
| </entry> |
| </map> |
| </option> |
| <option name="updateDate" value="1560937891911" /> |
| </EduTask> |
| </list> |
| </option> |
| </Lesson> |
| </list> |
| </option> |
| </Section> |
| <Section> |
| <option name="courseId" value="54532" /> |
| <option name="customPresentableName" /> |
| <option name="id" value="85645" /> |
| <option name="index" value="2" /> |
| <option name="name" value="Core Transforms" /> |
| <option name="position" value="0" /> |
| <option name="stepikChangeStatus" value="Up to date" /> |
| <option name="updateDate" value="1560432551000" /> |
| <option name="items"> |
| <list> |
| <Lesson> |
| <option name="customPresentableName" /> |
| <option name="id" value="238427" /> |
| <option name="index" value="1" /> |
| <option name="name" value="Map" /> |
| <option name="stepikChangeStatus" value="Content changed" /> |
| <option name="updateDate" value="1560937929994" /> |
| <option name="unitId" value="210887" /> |
| <option name="items"> |
| <list> |
| <EduTask> |
| <option name="customPresentableName" /> |
| <option name="descriptionFormat" value="HTML" /> |
| <option name="descriptionText" value="<!-- ~ Licensed to the Apache Software Foundation (ASF) under one ~ or more contributor license agreements. See the NOTICE file ~ distributed with this work for additional information ~ regarding copyright ownership. The ASF licenses this file ~ to you under the Apache License, Version 2.0 (the ~ "License"); you may not use this file except in compliance ~ with the License. You may obtain a copy of the License at ~ ~ http://www.apache.org/licenses/LICENSE-2.0 ~ ~ Unless required by applicable law or agreed to in writing, software ~ distributed under the License is distributed on an "AS IS" BASIS, ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. ~ See the License for the specific language governing permissions and ~ limitations under the License. --> <html> <h2>ParDo</h2> <p> ParDo is a Beam transform for generic parallel processing. The ParDo processing paradigm is similar to the “Map” phase of a Map/Shuffle/Reduce-style algorithm: a ParDo transform considers each element in the input PCollection, performs some processing function (your user code) on that element, and emits zero, one, or multiple elements to an output PCollection. </p> <p> <b>Kata:</b> Please write a simple ParDo that maps the input element by multiplying it by 10. </p> <br> <div class="hint"> Override <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.DoFn.process"> process</a> method. </div> <div class="hint"> Use <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.ParDo"> ParDo</a> with <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.DoFn">DoFn</a>. </div> <div class="hint"> Refer to the Beam Programming Guide <a href="https://beam.apache.org/documentation/programming-guide/#pardo">"ParDo"</a> section for more information. 
</div> </html> " /> |
| <option name="feedbackLink"> |
| <FeedbackLink> |
| <option name="link" /> |
| <option name="type" value="STEPIK" /> |
| </FeedbackLink> |
| </option> |
| <option name="id" value="755577" /> |
| <option name="index" value="1" /> |
| <option name="name" value="ParDo" /> |
| <option name="record" value="-1" /> |
| <option name="status" value="Unchecked" /> |
| <option name="stepikChangeStatus" value="Info and Content changed" /> |
| <option name="files"> |
| <map> |
| <entry key="task.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list> |
| <AnswerPlaceholder> |
| <option name="hints"> |
| <list /> |
| </option> |
| <option name="index" value="0" /> |
| <option name="initialState" /> |
| <option name="initializedFromDependency" value="false" /> |
| <option name="length" value="6" /> |
| <option name="offset" value="919" /> |
| <option name="placeholderDependency" /> |
| <option name="placeholderText" value="TODO()" /> |
| <option name="possibleAnswer" value="def process(self, element): yield element * 10" /> |
| <option name="selected" value="false" /> |
| <option name="status" value="Unchecked" /> |
| <option name="studentAnswer" /> |
| <option name="useLength" value="false" /> |
| </AnswerPlaceholder> |
| <AnswerPlaceholder> |
| <option name="hints"> |
| <list /> |
| </option> |
| <option name="index" value="1" /> |
| <option name="initialState" /> |
| <option name="initializedFromDependency" value="false" /> |
| <option name="length" value="6" /> |
| <option name="offset" value="1036" /> |
| <option name="placeholderDependency" /> |
| <option name="placeholderText" value="TODO()" /> |
| <option name="possibleAnswer" value="beam.ParDo(MultiplyByTenDoFn())" /> |
| <option name="selected" value="false" /> |
| <option name="status" value="Unchecked" /> |
| <option name="studentAnswer" /> |
| <option name="useLength" value="false" /> |
| </AnswerPlaceholder> |
| </list> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="task.py" /> |
| <option name="text" value="# TODO: type solution here " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="true" /> |
| </TaskFile> |
| </value> |
| </entry> |
| <entry key="tests.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list /> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="tests.py" /> |
| <option name="text" value="from test_helper import run_common_tests, failed, passed, get_answer_placeholders def test_answer_placeholders(): placeholders = get_answer_placeholders() placeholder = placeholders[0] if placeholder == "": # TODO: your condition here passed() else: failed() if __name__ == '__main__': run_common_tests() # test_answer_placeholders() # TODO: uncomment test call " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="false" /> |
| </TaskFile> |
| </value> |
| </entry> |
| </map> |
| </option> |
| <option name="updateDate" value="1560937936091" /> |
| </EduTask> |
| <EduTask> |
| <option name="customPresentableName" /> |
| <option name="descriptionFormat" value="HTML" /> |
| <option name="descriptionText" value="<!-- ~ Licensed to the Apache Software Foundation (ASF) under one ~ or more contributor license agreements. See the NOTICE file ~ distributed with this work for additional information ~ regarding copyright ownership. The ASF licenses this file ~ to you under the Apache License, Version 2.0 (the ~ "License"); you may not use this file except in compliance ~ with the License. You may obtain a copy of the License at ~ ~ http://www.apache.org/licenses/LICENSE-2.0 ~ ~ Unless required by applicable law or agreed to in writing, software ~ distributed under the License is distributed on an "AS IS" BASIS, ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. ~ See the License for the specific language governing permissions and ~ limitations under the License. --> <html> <h2>ParDo OneToMany</h2> <p> <b>Kata:</b> Please write a ParDo that maps each input sentence into words tokenized by whitespace (" "). </p> <br> <div class="hint"> Override <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.DoFn.process"> process</a> method. You can return an Iterable for multiple elements or call "yield" for each element to return a generator. </div> <div class="hint"> Use <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.ParDo"> ParDo</a> with <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.DoFn"> DoFn</a>. </div> <div class="hint"> Refer to the Beam Programming Guide <a href="https://beam.apache.org/documentation/programming-guide/#pardo">"ParDo"</a> section for more information. </div> </html> " /> |
| <option name="feedbackLink"> |
| <FeedbackLink> |
| <option name="link" /> |
| <option name="type" value="STEPIK" /> |
| </FeedbackLink> |
| </option> |
| <option name="id" value="755578" /> |
| <option name="index" value="2" /> |
| <option name="name" value="ParDo OneToMany" /> |
| <option name="record" value="-1" /> |
| <option name="status" value="Unchecked" /> |
| <option name="stepikChangeStatus" value="Info and Content changed" /> |
| <option name="files"> |
| <map> |
| <entry key="task.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list> |
| <AnswerPlaceholder> |
| <option name="hints"> |
| <list /> |
| </option> |
| <option name="index" value="0" /> |
| <option name="initialState" /> |
| <option name="initializedFromDependency" value="false" /> |
| <option name="length" value="6" /> |
| <option name="offset" value="920" /> |
| <option name="placeholderDependency" /> |
| <option name="placeholderText" value="TODO()" /> |
| <option name="possibleAnswer" value="def process(self, element): return element.split()" /> |
| <option name="selected" value="false" /> |
| <option name="status" value="Unchecked" /> |
| <option name="studentAnswer" /> |
| <option name="useLength" value="false" /> |
| </AnswerPlaceholder> |
| <AnswerPlaceholder> |
| <option name="hints"> |
| <list /> |
| </option> |
| <option name="index" value="1" /> |
| <option name="initialState" /> |
| <option name="initializedFromDependency" value="false" /> |
| <option name="length" value="6" /> |
| <option name="offset" value="1057" /> |
| <option name="placeholderDependency" /> |
| <option name="placeholderText" value="TODO()" /> |
| <option name="possibleAnswer" value="beam.ParDo(BreakIntoWordsDoFn())" /> |
| <option name="selected" value="false" /> |
| <option name="status" value="Unchecked" /> |
| <option name="studentAnswer" /> |
| <option name="useLength" value="false" /> |
| </AnswerPlaceholder> |
| </list> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="task.py" /> |
| <option name="text" value="# TODO: type solution here " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="true" /> |
| </TaskFile> |
| </value> |
| </entry> |
| <entry key="tests.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list /> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="tests.py" /> |
| <option name="text" value="from test_helper import run_common_tests, failed, passed, get_answer_placeholders def test_answer_placeholders(): placeholders = get_answer_placeholders() placeholder = placeholders[0] if placeholder == "": # TODO: your condition here passed() else: failed() if __name__ == '__main__': run_common_tests() # test_answer_placeholders() # TODO: uncomment test call " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="false" /> |
| </TaskFile> |
| </value> |
| </entry> |
| </map> |
| </option> |
| <option name="updateDate" value="1560937938522" /> |
| </EduTask> |
| <EduTask> |
| <option name="customPresentableName" /> |
| <option name="descriptionFormat" value="HTML" /> |
| <option name="descriptionText" value="<!-- ~ Licensed to the Apache Software Foundation (ASF) under one ~ or more contributor license agreements. See the NOTICE file ~ distributed with this work for additional information ~ regarding copyright ownership. The ASF licenses this file ~ to you under the Apache License, Version 2.0 (the ~ "License"); you may not use this file except in compliance ~ with the License. You may obtain a copy of the License at ~ ~ http://www.apache.org/licenses/LICENSE-2.0 ~ ~ Unless required by applicable law or agreed to in writing, software ~ distributed under the License is distributed on an "AS IS" BASIS, ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. ~ See the License for the specific language governing permissions and ~ limitations under the License. --> <html> <h2>MapElements</h2> <p> The Beam SDKs provide language-specific ways to simplify how you provide your DoFn implementation. </p> <p> <b>Kata:</b> Implement a simple map function that multiplies all input elements by 5 using <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.Map"> Map</a>. </p> <br> <div class="hint"> Use <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.Map"> Map</a> with a lambda. </div> <div class="hint"> Refer to the Beam Programming Guide <a href="https://beam.apache.org/documentation/programming-guide/#lightweight-dofns"> "Lightweight DoFns and other abstractions"</a> section for more information. </div> </html> " /> |
| <option name="feedbackLink"> |
| <FeedbackLink> |
| <option name="link" /> |
| <option name="type" value="STEPIK" /> |
| </FeedbackLink> |
| </option> |
| <option name="id" value="755579" /> |
| <option name="index" value="3" /> |
| <option name="name" value="Map" /> |
| <option name="record" value="-1" /> |
| <option name="status" value="Unchecked" /> |
| <option name="stepikChangeStatus" value="Up to date" /> |
| <option name="files"> |
| <map> |
| <entry key="task.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list> |
| <AnswerPlaceholder> |
| <option name="hints"> |
| <list /> |
| </option> |
| <option name="index" value="0" /> |
| <option name="initialState" /> |
| <option name="initializedFromDependency" value="false" /> |
| <option name="length" value="6" /> |
| <option name="offset" value="942" /> |
| <option name="placeholderDependency" /> |
| <option name="placeholderText" value="TODO()" /> |
| <option name="possibleAnswer" value="beam.Map(lambda num: num * 5)" /> |
| <option name="selected" value="false" /> |
| <option name="status" value="Unchecked" /> |
| <option name="studentAnswer" /> |
| <option name="useLength" value="false" /> |
| </AnswerPlaceholder> |
| </list> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="task.py" /> |
| <option name="text" value="# TODO: type solution here " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="true" /> |
| </TaskFile> |
| </value> |
| </entry> |
| <entry key="tests.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list /> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="tests.py" /> |
| <option name="text" value="from test_helper import run_common_tests, failed, passed, get_answer_placeholders def test_answer_placeholders(): placeholders = get_answer_placeholders() placeholder = placeholders[0] if placeholder == "": # TODO: your condition here passed() else: failed() if __name__ == '__main__': run_common_tests() # test_answer_placeholders() # TODO: uncomment test call " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="false" /> |
| </TaskFile> |
| </value> |
| </entry> |
| </map> |
| </option> |
| <option name="updateDate" value="1560937942178" /> |
| </EduTask> |
| <EduTask> |
| <option name="customPresentableName" /> |
| <option name="descriptionFormat" value="HTML" /> |
| <option name="descriptionText" value="<!-- ~ Licensed to the Apache Software Foundation (ASF) under one ~ or more contributor license agreements. See the NOTICE file ~ distributed with this work for additional information ~ regarding copyright ownership. The ASF licenses this file ~ to you under the Apache License, Version 2.0 (the ~ "License"); you may not use this file except in compliance ~ with the License. You may obtain a copy of the License at ~ ~ http://www.apache.org/licenses/LICENSE-2.0 ~ ~ Unless required by applicable law or agreed to in writing, software ~ distributed under the License is distributed on an "AS IS" BASIS, ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. ~ See the License for the specific language governing permissions and ~ limitations under the License. --> <html> <h2>FlatMapElements</h2> <p> The Beam SDKs provide language-specific ways to simplify how you provide your DoFn implementation. </p> <p> FlatMap can be used to simplify a DoFn that maps an element to multiple elements (one to many). </p> <p> <b>Kata:</b> Implement a function that maps each input sentence into words tokenized by whitespace (" ") using <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.FlatMap"> FlatMap</a>. </p> <br> <div class="hint"> Use <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.FlatMap"> FlatMap</a> with a lambda. </div> <div class="hint"> Refer to the Beam Programming Guide <a href="https://beam.apache.org/documentation/programming-guide/#lightweight-dofns"> "Lightweight DoFns and other abstractions"</a> section for more information. </div> </html> " /> |
| <option name="feedbackLink"> |
| <FeedbackLink> |
| <option name="link" /> |
| <option name="type" value="STEPIK" /> |
| </FeedbackLink> |
| </option> |
| <option name="id" value="755580" /> |
| <option name="index" value="4" /> |
| <option name="name" value="FlatMap" /> |
| <option name="record" value="-1" /> |
| <option name="status" value="Unchecked" /> |
| <option name="stepikChangeStatus" value="Up to date" /> |
| <option name="files"> |
| <map> |
| <entry key="task.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list> |
| <AnswerPlaceholder> |
| <option name="hints"> |
| <list /> |
| </option> |
| <option name="index" value="0" /> |
| <option name="initialState" /> |
| <option name="initializedFromDependency" value="false" /> |
| <option name="length" value="6" /> |
| <option name="offset" value="968" /> |
| <option name="placeholderDependency" /> |
| <option name="placeholderText" value="TODO()" /> |
| <option name="possibleAnswer" value="beam.FlatMap(lambda sentence: sentence.split())" /> |
| <option name="selected" value="false" /> |
| <option name="status" value="Unchecked" /> |
| <option name="studentAnswer" /> |
| <option name="useLength" value="false" /> |
| </AnswerPlaceholder> |
| </list> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="task.py" /> |
| <option name="text" value="# TODO: type solution here " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="true" /> |
| </TaskFile> |
| </value> |
| </entry> |
| <entry key="tests.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list /> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="tests.py" /> |
| <option name="text" value="from test_helper import run_common_tests, failed, passed, get_answer_placeholders def test_answer_placeholders(): placeholders = get_answer_placeholders() placeholder = placeholders[0] if placeholder == "": # TODO: your condition here passed() else: failed() if __name__ == '__main__': run_common_tests() # test_answer_placeholders() # TODO: uncomment test call " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="false" /> |
| </TaskFile> |
| </value> |
| </entry> |
| </map> |
| </option> |
| <option name="updateDate" value="1560937944601" /> |
| </EduTask> |
| </list> |
| </option> |
| </Lesson> |
| <Lesson> |
| <option name="customPresentableName" /> |
| <option name="id" value="238428" /> |
| <option name="index" value="2" /> |
| <option name="name" value="GroupByKey" /> |
| <option name="stepikChangeStatus" value="Up to date" /> |
| <option name="updateDate" value="1560937980839" /> |
| <option name="unitId" value="210888" /> |
| <option name="items"> |
| <list> |
| <EduTask> |
| <option name="customPresentableName" /> |
| <option name="descriptionFormat" value="HTML" /> |
| <option name="descriptionText" value="<!-- ~ Licensed to the Apache Software Foundation (ASF) under one ~ or more contributor license agreements. See the NOTICE file ~ distributed with this work for additional information ~ regarding copyright ownership. The ASF licenses this file ~ to you under the Apache License, Version 2.0 (the ~ "License"); you may not use this file except in compliance ~ with the License. You may obtain a copy of the License at ~ ~ http://www.apache.org/licenses/LICENSE-2.0 ~ ~ Unless required by applicable law or agreed to in writing, software ~ distributed under the License is distributed on an "AS IS" BASIS, ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. ~ See the License for the specific language governing permissions and ~ limitations under the License. --> <html> <h2>GroupByKey</h2> <p> GroupByKey is a Beam transform for processing collections of key/value pairs. It’s a parallel reduction operation, analogous to the Shuffle phase of a Map/Shuffle/Reduce-style algorithm. The input to GroupByKey is a collection of key/value pairs that represents a multimap, where the collection contains multiple pairs that have the same key, but different values. Given such a collection, you use GroupByKey to collect all of the values associated with each unique key. </p> <p> <b>Kata:</b> Implement a <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.GroupByKey"> GroupByKey</a> transform that groups words by their first letter. </p> <br> <div class="hint"> Refer to <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.GroupByKey">GroupByKey</a> to solve this problem. </div> <div class="hint"> Refer to the Beam Programming Guide <a href="https://beam.apache.org/documentation/programming-guide/#groupbykey"> "GroupByKey"</a> section for more information. </div> </html> " /> |
| <option name="feedbackLink"> |
| <FeedbackLink> |
| <option name="link" /> |
| <option name="type" value="STEPIK" /> |
| </FeedbackLink> |
| </option> |
| <option name="id" value="755582" /> |
| <option name="index" value="1" /> |
| <option name="name" value="GroupByKey" /> |
| <option name="record" value="-1" /> |
| <option name="status" value="Unchecked" /> |
| <option name="stepikChangeStatus" value="Up to date" /> |
| <option name="files"> |
| <map> |
| <entry key="task.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list> |
| <AnswerPlaceholder> |
| <option name="hints"> |
| <list /> |
| </option> |
| <option name="index" value="0" /> |
| <option name="initialState" /> |
| <option name="initializedFromDependency" value="false" /> |
| <option name="length" value="8" /> |
| <option name="offset" value="970" /> |
| <option name="placeholderDependency" /> |
| <option name="placeholderText" value="| TODO()" /> |
| <option name="possibleAnswer" value="| beam.Map(lambda word: (word[0], word)) | beam.GroupByKey()" /> |
| <option name="selected" value="false" /> |
| <option name="status" value="Unchecked" /> |
| <option name="studentAnswer" /> |
| <option name="useLength" value="false" /> |
| </AnswerPlaceholder> |
| </list> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="task.py" /> |
| <option name="text" value="# TODO: type solution here " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="true" /> |
| </TaskFile> |
| </value> |
| </entry> |
| <entry key="tests.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list /> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="tests.py" /> |
| <option name="text" value="from test_helper import run_common_tests, failed, passed, get_answer_placeholders def test_answer_placeholders(): placeholders = get_answer_placeholders() placeholder = placeholders[0] if placeholder == "": # TODO: your condition here passed() else: failed() if __name__ == '__main__': run_common_tests() # test_answer_placeholders() # TODO: uncomment test call " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="false" /> |
| </TaskFile> |
| </value> |
| </entry> |
| </map> |
| </option> |
| <option name="updateDate" value="1560937986273" /> |
| </EduTask> |
| </list> |
| </option> |
| </Lesson> |
| <Lesson> |
| <option name="customPresentableName" /> |
| <option name="id" value="238429" /> |
| <option name="index" value="3" /> |
| <option name="name" value="CoGroupByKey" /> |
| <option name="stepikChangeStatus" value="Up to date" /> |
| <option name="updateDate" value="1560938006360" /> |
| <option name="unitId" value="210889" /> |
| <option name="items"> |
| <list> |
| <EduTask> |
| <option name="customPresentableName" /> |
| <option name="descriptionFormat" value="HTML" /> |
| <option name="descriptionText" value="<!-- ~ Licensed to the Apache Software Foundation (ASF) under one ~ or more contributor license agreements. See the NOTICE file ~ distributed with this work for additional information ~ regarding copyright ownership. The ASF licenses this file ~ to you under the Apache License, Version 2.0 (the ~ "License"); you may not use this file except in compliance ~ with the License. You may obtain a copy of the License at ~ ~ http://www.apache.org/licenses/LICENSE-2.0 ~ ~ Unless required by applicable law or agreed to in writing, software ~ distributed under the License is distributed on an "AS IS" BASIS, ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. ~ See the License for the specific language governing permissions and ~ limitations under the License. --> <html> <h2>CoGroupByKey</h2> <p> CoGroupByKey performs a relational join of two or more key/value PCollections that have the same key type. </p> <p> <b>Kata:</b> Implement a <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.util.html#apache_beam.transforms.util.CoGroupByKey"> CoGroupByKey</a> transform that joins words by their first letter, and then produces the string representation of the WordsAlphabet model. </p> <br> <div class="hint"> Refer to <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.util.html#apache_beam.transforms.util.CoGroupByKey"> CoGroupByKey</a> to solve this problem. </div> <div class="hint"> Refer to the Beam Programming Guide <a href="https://beam.apache.org/documentation/programming-guide/#cogroupbykey"> "CoGroupByKey"</a> section for more information. </div> </html> " /> |
| <option name="feedbackLink"> |
| <FeedbackLink> |
| <option name="link" /> |
| <option name="type" value="STEPIK" /> |
| </FeedbackLink> |
| </option> |
| <option name="id" value="755583" /> |
| <option name="index" value="1" /> |
| <option name="name" value="CoGroupByKey" /> |
| <option name="record" value="-1" /> |
| <option name="status" value="Unchecked" /> |
| <option name="stepikChangeStatus" value="Up to date" /> |
| <option name="files"> |
| <map> |
| <entry key="task.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list> |
| <AnswerPlaceholder> |
| <option name="hints"> |
| <list /> |
| </option> |
| <option name="index" value="0" /> |
| <option name="initialState" /> |
| <option name="initializedFromDependency" value="false" /> |
| <option name="length" value="6" /> |
| <option name="offset" value="1228" /> |
| <option name="placeholderDependency" /> |
| <option name="placeholderText" value="TODO()" /> |
| <option name="possibleAnswer" value="def map_to_alphabet_kv(word): return (word[0], word) def cogbk_result_to_wordsalphabet(cgbk_result): (alphabet, words) = cgbk_result return WordsAlphabet(alphabet, words['fruits'][0], words['countries'][0]) fruits_kv = (fruits | 'Fruit to KV' >> beam.Map(map_to_alphabet_kv)) countries_kv = (countries | 'Country to KV' >> beam.Map(map_to_alphabet_kv)) return ({'fruits': fruits_kv, 'countries': countries_kv} | beam.CoGroupByKey() | beam.Map(cogbk_result_to_wordsalphabet))" /> |
| <option name="selected" value="false" /> |
| <option name="status" value="Unchecked" /> |
| <option name="studentAnswer" /> |
| <option name="useLength" value="false" /> |
| </AnswerPlaceholder> |
| </list> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="task.py" /> |
| <option name="text" value="# TODO: type solution here " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="true" /> |
| </TaskFile> |
| </value> |
| </entry> |
| <entry key="tests.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list /> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="tests.py" /> |
| <option name="text" value="from test_helper import run_common_tests, failed, passed, get_answer_placeholders def test_answer_placeholders(): placeholders = get_answer_placeholders() placeholder = placeholders[0] if placeholder == "": # TODO: your condition here passed() else: failed() if __name__ == '__main__': run_common_tests() # test_answer_placeholders() # TODO: uncomment test call " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="false" /> |
| </TaskFile> |
| </value> |
| </entry> |
| </map> |
| </option> |
| <option name="updateDate" value="1560938011025" /> |
| </EduTask> |
| </list> |
| </option> |
| </Lesson> |
| <Lesson> |
| <option name="customPresentableName" /> |
| <option name="id" value="238430" /> |
| <option name="index" value="4" /> |
| <option name="name" value="Combine" /> |
| <option name="stepikChangeStatus" value="Content changed" /> |
| <option name="updateDate" value="1560938016807" /> |
| <option name="unitId" value="210890" /> |
| <option name="items"> |
| <list> |
| <EduTask> |
| <option name="customPresentableName" /> |
| <option name="descriptionFormat" value="HTML" /> |
| <option name="descriptionText" value="<!-- ~ Licensed to the Apache Software Foundation (ASF) under one ~ or more contributor license agreements. See the NOTICE file ~ distributed with this work for additional information ~ regarding copyright ownership. The ASF licenses this file ~ to you under the Apache License, Version 2.0 (the ~ "License"); you may not use this file except in compliance ~ with the License. You may obtain a copy of the License at ~ ~ http://www.apache.org/licenses/LICENSE-2.0 ~ ~ Unless required by applicable law or agreed to in writing, software ~ distributed under the License is distributed on an "AS IS" BASIS, ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. ~ See the License for the specific language governing permissions and ~ limitations under the License. --> <html> <h2>Combine - Simple Function</h2> <p> Combine is a Beam transform for combining collections of elements or values in your data. When you apply a Combine transform, you must provide the function that contains the logic for combining the elements or values. The combining function should be commutative and associative, as the function is not necessarily invoked exactly once on all values with a given key. Because the input data (including the value collection) may be distributed across multiple workers, the combining function might be called multiple times to perform partial combining on subsets of the value collection. </p> <p> Simple combine operations, such as sums, can usually be implemented as a simple function. </p> <p> <b>Kata:</b> Implement the summation of numbers using <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.CombineGlobally"> CombineGlobally</a>. </p> <br> <div class="hint"> Implement a simple Python function that performs the summation of the values. 
</div> <div class="hint"> Refer to the Beam Programming Guide <a href="https://beam.apache.org/documentation/programming-guide/#simple-combines"> "Simple combinations using simple functions"</a> section for more information. </div> </html> " /> |
| <option name="feedbackLink"> |
| <FeedbackLink> |
| <option name="link" /> |
| <option name="type" value="STEPIK" /> |
| </FeedbackLink> |
| </option> |
| <option name="id" value="755584" /> |
| <option name="index" value="1" /> |
| <option name="name" value="Simple Function" /> |
| <option name="record" value="-1" /> |
| <option name="status" value="Unchecked" /> |
| <option name="stepikChangeStatus" value="Up to date" /> |
| <option name="files"> |
| <map> |
| <entry key="task.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list> |
| <AnswerPlaceholder> |
| <option name="hints"> |
| <list /> |
| </option> |
| <option name="index" value="0" /> |
| <option name="initialState" /> |
| <option name="initializedFromDependency" value="false" /> |
| <option name="length" value="6" /> |
| <option name="offset" value="900" /> |
| <option name="placeholderDependency" /> |
| <option name="placeholderText" value="TODO()" /> |
| <option name="possibleAnswer" value="total = 0 for num in numbers: total += num return total" /> |
| <option name="selected" value="false" /> |
| <option name="status" value="Unchecked" /> |
| <option name="studentAnswer" /> |
| <option name="useLength" value="false" /> |
| </AnswerPlaceholder> |
| <AnswerPlaceholder> |
| <option name="hints"> |
| <list /> |
| </option> |
| <option name="index" value="1" /> |
| <option name="initialState" /> |
| <option name="initializedFromDependency" value="false" /> |
| <option name="length" value="6" /> |
| <option name="offset" value="1036" /> |
| <option name="placeholderDependency" /> |
| <option name="placeholderText" value="TODO()" /> |
| <option name="possibleAnswer" value="beam.CombineGlobally(sum)" /> |
| <option name="selected" value="false" /> |
| <option name="status" value="Unchecked" /> |
| <option name="studentAnswer" /> |
| <option name="useLength" value="false" /> |
| </AnswerPlaceholder> |
| </list> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="task.py" /> |
| <option name="text" value="# TODO: type solution here " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="true" /> |
| </TaskFile> |
| </value> |
| </entry> |
| <entry key="tests.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list /> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="tests.py" /> |
| <option name="text" value="from test_helper import run_common_tests, failed, passed, get_answer_placeholders def test_answer_placeholders(): placeholders = get_answer_placeholders() placeholder = placeholders[0] if placeholder == "": # TODO: your condition here passed() else: failed() if __name__ == '__main__': run_common_tests() # test_answer_placeholders() # TODO: uncomment test call " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="false" /> |
| </TaskFile> |
| </value> |
| </entry> |
| </map> |
| </option> |
| <option name="updateDate" value="1560938025042" /> |
| </EduTask> |
| <EduTask> |
| <option name="customPresentableName" /> |
| <option name="descriptionFormat" value="HTML" /> |
| <option name="descriptionText" value="<!-- ~ Licensed to the Apache Software Foundation (ASF) under one ~ or more contributor license agreements. See the NOTICE file ~ distributed with this work for additional information ~ regarding copyright ownership. The ASF licenses this file ~ to you under the Apache License, Version 2.0 (the ~ "License"); you may not use this file except in compliance ~ with the License. You may obtain a copy of the License at ~ ~ http://www.apache.org/licenses/LICENSE-2.0 ~ ~ Unless required by applicable law or agreed to in writing, software ~ distributed under the License is distributed on an "AS IS" BASIS, ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. ~ See the License for the specific language governing permissions and ~ limitations under the License. --> <html> <h2>Combine - CombineFn</h2> <p> Combine is a Beam transform for combining collections of elements or values in your data. When you apply a Combine transform, you must provide the function that contains the logic for combining the elements or values. The combining function should be commutative and associative, as the function is not necessarily invoked exactly once on all values with a given key. Because the input data (including the value collection) may be distributed across multiple workers, the combining function might be called multiple times to perform partial combining on subsets of the value collection. </p> <p> Complex combination operations might require you to create a subclass of CombineFn that has an accumulation type distinct from the input/output type. You should use CombineFn if the combine function requires a more sophisticated accumulator, must perform additional pre- or post-processing, might change the output type, or takes the key into account. 
</p> <p> <b>Kata:</b> Implement the average of numbers using <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.CombineFn"> CombineFn</a>. </p> <br> <div class="hint"> Extend the <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.CombineFn"> CombineFn</a> class to compute the average of the numbers. </div> <div class="hint"> Refer to the Beam Programming Guide <a href="https://beam.apache.org/documentation/programming-guide/#advanced-combines"> "Advanced combinations using CombineFn"</a> section for more information. </div> </html> " /> |
| <option name="feedbackLink"> |
| <FeedbackLink> |
| <option name="link" /> |
| <option name="type" value="STEPIK" /> |
| </FeedbackLink> |
| </option> |
| <option name="id" value="755585" /> |
| <option name="index" value="2" /> |
| <option name="name" value="CombineFn" /> |
| <option name="record" value="-1" /> |
| <option name="status" value="Unchecked" /> |
| <option name="stepikChangeStatus" value="Up to date" /> |
| <option name="files"> |
| <map> |
| <entry key="task.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list> |
| <AnswerPlaceholder> |
| <option name="hints"> |
| <list /> |
| </option> |
| <option name="index" value="0" /> |
| <option name="initialState" /> |
| <option name="initializedFromDependency" value="false" /> |
| <option name="length" value="6" /> |
| <option name="offset" value="916" /> |
| <option name="placeholderDependency" /> |
| <option name="placeholderText" value="TODO()" /> |
| <option name="possibleAnswer" value="def create_accumulator(self): return 0.0, 0 def add_input(self, accumulator, element): (sum, count) = accumulator return sum + element, count + 1 def merge_accumulators(self, accumulators): sums, counts = zip(*accumulators) return sum(sums), sum(counts) def extract_output(self, accumulator): (sum, count) = accumulator return sum / count if count else float('NaN')" /> |
| <option name="selected" value="false" /> |
| <option name="status" value="Unchecked" /> |
| <option name="studentAnswer" /> |
| <option name="useLength" value="false" /> |
| </AnswerPlaceholder> |
| <AnswerPlaceholder> |
| <option name="hints"> |
| <list /> |
| </option> |
| <option name="index" value="1" /> |
| <option name="initialState" /> |
| <option name="initializedFromDependency" value="false" /> |
| <option name="length" value="6" /> |
| <option name="offset" value="1420" /> |
| <option name="placeholderDependency" /> |
| <option name="placeholderText" value="TODO()" /> |
| <option name="possibleAnswer" value="beam.CombineGlobally(AverageFn())" /> |
| <option name="selected" value="false" /> |
| <option name="status" value="Unchecked" /> |
| <option name="studentAnswer" /> |
| <option name="useLength" value="false" /> |
| </AnswerPlaceholder> |
| </list> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="task.py" /> |
| <option name="text" value="# TODO: type solution here " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="true" /> |
| </TaskFile> |
| </value> |
| </entry> |
| <entry key="tests.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list /> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="tests.py" /> |
| <option name="text" value="from test_helper import run_common_tests, failed, passed, get_answer_placeholders def test_answer_placeholders(): placeholders = get_answer_placeholders() placeholder = placeholders[0] if placeholder == "": # TODO: your condition here passed() else: failed() if __name__ == '__main__': run_common_tests() # test_answer_placeholders() # TODO: uncomment test call " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="false" /> |
| </TaskFile> |
| </value> |
| </entry> |
| </map> |
| </option> |
| <option name="updateDate" value="1560938027519" /> |
| </EduTask> |
| <EduTask> |
| <option name="customPresentableName" /> |
| <option name="descriptionFormat" value="HTML" /> |
| <option name="descriptionText" value="<!-- ~ Licensed to the Apache Software Foundation (ASF) under one ~ or more contributor license agreements. See the NOTICE file ~ distributed with this work for additional information ~ regarding copyright ownership. The ASF licenses this file ~ to you under the Apache License, Version 2.0 (the ~ "License"); you may not use this file except in compliance ~ with the License. You may obtain a copy of the License at ~ ~ http://www.apache.org/licenses/LICENSE-2.0 ~ ~ Unless required by applicable law or agreed to in writing, software ~ distributed under the License is distributed on an "AS IS" BASIS, ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. ~ See the License for the specific language governing permissions and ~ limitations under the License. --> <html> <h2>Combine - Combine PerKey</h2> <p> After creating a keyed PCollection (for example, by using a GroupByKey transform), a common pattern is to combine the collection of values associated with each key into a single, merged value. This pattern of a GroupByKey followed by merging the collection of values is equivalent to the CombinePerKey transform. The combine function you supply to CombinePerKey must be an associative reduction function or a subclass of CombineFn. </p> <p> <b>Kata:</b> Implement the sum of scores per player using <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.CombinePerKey"> CombinePerKey</a>. </p> <br> <div class="hint"> Use <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.CombinePerKey"> CombinePerKey(CombineFn)</a>. </div> <div class="hint"> Extend the <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.CombineFn"> CombineFn</a> class to compute the sum of the numbers. 
</div> <div class="hint"> Refer to the Beam Programming Guide <a href="https://beam.apache.org/documentation/programming-guide/#combining-values-in-a-keyed-pcollection"> "Combining values in a keyed PCollection"</a> section for more information. </div> </html> " /> |
| <option name="feedbackLink"> |
| <FeedbackLink> |
| <option name="link" /> |
| <option name="type" value="STEPIK" /> |
| </FeedbackLink> |
| </option> |
| <option name="id" value="755587" /> |
| <option name="index" value="3" /> |
| <option name="name" value="Combine PerKey" /> |
| <option name="record" value="-1" /> |
| <option name="status" value="Unchecked" /> |
| <option name="stepikChangeStatus" value="Up to date" /> |
| <option name="files"> |
| <map> |
| <entry key="task.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list> |
| <AnswerPlaceholder> |
| <option name="hints"> |
| <list /> |
| </option> |
| <option name="index" value="0" /> |
| <option name="initialState" /> |
| <option name="initializedFromDependency" value="false" /> |
| <option name="length" value="6" /> |
| <option name="offset" value="1088" /> |
| <option name="placeholderDependency" /> |
| <option name="placeholderText" value="TODO()" /> |
| <option name="possibleAnswer" value="beam.CombinePerKey(sum)" /> |
| <option name="selected" value="false" /> |
| <option name="status" value="Unchecked" /> |
| <option name="studentAnswer" /> |
| <option name="useLength" value="false" /> |
| </AnswerPlaceholder> |
| </list> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="task.py" /> |
| <option name="text" value="# TODO: type solution here " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="true" /> |
| </TaskFile> |
| </value> |
| </entry> |
| <entry key="tests.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list /> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="tests.py" /> |
| <option name="text" value="from test_helper import run_common_tests, failed, passed, get_answer_placeholders def test_answer_placeholders(): placeholders = get_answer_placeholders() placeholder = placeholders[0] if placeholder == "": # TODO: your condition here passed() else: failed() if __name__ == '__main__': run_common_tests() # test_answer_placeholders() # TODO: uncomment test call " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="false" /> |
| </TaskFile> |
| </value> |
| </entry> |
| </map> |
| </option> |
| <option name="updateDate" value="1560938030159" /> |
| </EduTask> |
| </list> |
| </option> |
| </Lesson> |
| <Lesson> |
| <option name="customPresentableName" /> |
| <option name="id" value="238431" /> |
| <option name="index" value="5" /> |
| <option name="name" value="Flatten" /> |
| <option name="stepikChangeStatus" value="Content changed" /> |
| <option name="updateDate" value="1560938036123" /> |
| <option name="unitId" value="210891" /> |
| <option name="items"> |
| <list> |
| <EduTask> |
| <option name="customPresentableName" /> |
| <option name="descriptionFormat" value="HTML" /> |
| <option name="descriptionText" value="<!-- ~ Licensed to the Apache Software Foundation (ASF) under one ~ or more contributor license agreements. See the NOTICE file ~ distributed with this work for additional information ~ regarding copyright ownership. The ASF licenses this file ~ to you under the Apache License, Version 2.0 (the ~ "License"); you may not use this file except in compliance ~ with the License. You may obtain a copy of the License at ~ ~ http://www.apache.org/licenses/LICENSE-2.0 ~ ~ Unless required by applicable law or agreed to in writing, software ~ distributed under the License is distributed on an "AS IS" BASIS, ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. ~ See the License for the specific language governing permissions and ~ limitations under the License. --> <html> <h2>Flatten</h2> <p> Flatten is a Beam transform for PCollection objects that store the same data type. Flatten merges multiple PCollection objects into a single logical PCollection. </p> <p> <b>Kata:</b> Implement a <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.Flatten"> Flatten</a> transform that merges two PCollections of words into a single PCollection. </p> <br> <div class="hint"> Refer to <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.Flatten"> Flatten</a> to solve this problem. </div> <div class="hint"> Refer to the Beam Programming Guide <a href="https://beam.apache.org/documentation/programming-guide/#flatten"> "Flatten"</a> section for more information. </div> </html> " /> |
| <option name="feedbackLink"> |
| <FeedbackLink> |
| <option name="link" /> |
| <option name="type" value="STEPIK" /> |
| </FeedbackLink> |
| </option> |
| <option name="id" value="755588" /> |
| <option name="index" value="1" /> |
| <option name="name" value="Flatten" /> |
| <option name="record" value="-1" /> |
| <option name="status" value="Unchecked" /> |
| <option name="stepikChangeStatus" value="Up to date" /> |
| <option name="files"> |
| <map> |
| <entry key="task.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list> |
| <AnswerPlaceholder> |
| <option name="hints"> |
| <list /> |
| </option> |
| <option name="index" value="0" /> |
| <option name="initialState" /> |
| <option name="initializedFromDependency" value="false" /> |
| <option name="length" value="6" /> |
| <option name="offset" value="1140" /> |
| <option name="placeholderDependency" /> |
| <option name="placeholderText" value="TODO()" /> |
| <option name="possibleAnswer" value="beam.Flatten()" /> |
| <option name="selected" value="false" /> |
| <option name="status" value="Unchecked" /> |
| <option name="studentAnswer" /> |
| <option name="useLength" value="false" /> |
| </AnswerPlaceholder> |
| </list> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="task.py" /> |
| <option name="text" value="# TODO: type solution here " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="true" /> |
| </TaskFile> |
| </value> |
| </entry> |
| <entry key="tests.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list /> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="tests.py" /> |
| <option name="text" value="from test_helper import run_common_tests, failed, passed, get_answer_placeholders def test_answer_placeholders(): placeholders = get_answer_placeholders() placeholder = placeholders[0] if placeholder == "": # TODO: your condition here passed() else: failed() if __name__ == '__main__': run_common_tests() # test_answer_placeholders() # TODO: uncomment test call " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="false" /> |
| </TaskFile> |
| </value> |
| </entry> |
| </map> |
| </option> |
| <option name="updateDate" value="1560938041998" /> |
| </EduTask> |
| </list> |
| </option> |
| </Lesson> |
| <Lesson> |
| <option name="customPresentableName" /> |
| <option name="id" value="238432" /> |
| <option name="index" value="6" /> |
| <option name="name" value="Partition" /> |
| <option name="stepikChangeStatus" value="Content changed" /> |
| <option name="updateDate" value="1560938052303" /> |
| <option name="unitId" value="210892" /> |
| <option name="items"> |
| <list> |
| <EduTask> |
| <option name="customPresentableName" /> |
| <option name="descriptionFormat" value="HTML" /> |
| <option name="descriptionText" value="<!-- ~ Licensed to the Apache Software Foundation (ASF) under one ~ or more contributor license agreements. See the NOTICE file ~ distributed with this work for additional information ~ regarding copyright ownership. The ASF licenses this file ~ to you under the Apache License, Version 2.0 (the ~ "License"); you may not use this file except in compliance ~ with the License. You may obtain a copy of the License at ~ ~ http://www.apache.org/licenses/LICENSE-2.0 ~ ~ Unless required by applicable law or agreed to in writing, software ~ distributed under the License is distributed on an "AS IS" BASIS, ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. ~ See the License for the specific language governing permissions and ~ limitations under the License. --> <html> <h2>Partition</h2> <p> Partition is a Beam transform for PCollection objects that store the same data type. Partition splits a single PCollection into a fixed number of smaller collections. </p> <p> Partition divides the elements of a PCollection according to a partitioning function that you provide. The partitioning function contains the logic that determines how to split up the elements of the input PCollection into each resulting partition PCollection. </p> <p> <b>Kata:</b> Implement a <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.Partition"> Partition</a> transform that splits a PCollection of numbers into two PCollections. The first PCollection contains numbers greater than 100, and the second PCollection contains the remaining numbers. </p> <br> <div class="hint"> Refer to <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.Partition"> Partition</a> to solve this problem. 
</div> <div class="hint"> Refer to the Beam Programming Guide <a href="https://beam.apache.org/documentation/programming-guide/#partition"> "Partition"</a> section for more information. </div> </html> " /> |
| <option name="feedbackLink"> |
| <FeedbackLink> |
| <option name="link" /> |
| <option name="type" value="STEPIK" /> |
| </FeedbackLink> |
| </option> |
| <option name="id" value="755589" /> |
| <option name="index" value="1" /> |
| <option name="name" value="Partition" /> |
| <option name="record" value="-1" /> |
| <option name="status" value="Unchecked" /> |
| <option name="stepikChangeStatus" value="Up to date" /> |
| <option name="files"> |
| <map> |
| <entry key="task.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list> |
| <AnswerPlaceholder> |
| <option name="hints"> |
| <list /> |
| </option> |
| <option name="index" value="0" /> |
| <option name="initialState" /> |
| <option name="initializedFromDependency" value="false" /> |
| <option name="length" value="6" /> |
| <option name="offset" value="924" /> |
| <option name="placeholderDependency" /> |
| <option name="placeholderText" value="TODO()" /> |
| <option name="possibleAnswer" value="if number > 100: return 0 else: return 1" /> |
| <option name="selected" value="false" /> |
| <option name="status" value="Unchecked" /> |
| <option name="studentAnswer" /> |
| <option name="useLength" value="false" /> |
| </AnswerPlaceholder> |
| <AnswerPlaceholder> |
| <option name="hints"> |
| <list /> |
| </option> |
| <option name="index" value="1" /> |
| <option name="initialState" /> |
| <option name="initializedFromDependency" value="false" /> |
| <option name="length" value="6" /> |
| <option name="offset" value="1087" /> |
| <option name="placeholderDependency" /> |
| <option name="placeholderText" value="TODO()" /> |
| <option name="possibleAnswer" value="beam.Partition(partition_fn, 2)" /> |
| <option name="selected" value="false" /> |
| <option name="status" value="Unchecked" /> |
| <option name="studentAnswer" /> |
| <option name="useLength" value="false" /> |
| </AnswerPlaceholder> |
| </list> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="task.py" /> |
| <option name="text" value="# TODO: type solution here " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="true" /> |
| </TaskFile> |
| </value> |
| </entry> |
| <entry key="tests.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list /> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="tests.py" /> |
| <option name="text" value="from test_helper import run_common_tests, failed, passed, get_answer_placeholders def test_answer_placeholders(): placeholders = get_answer_placeholders() placeholder = placeholders[0] if placeholder == "": # TODO: your condition here passed() else: failed() if __name__ == '__main__': run_common_tests() # test_answer_placeholders() # TODO: uncomment test call " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="false" /> |
| </TaskFile> |
| </value> |
| </entry> |
| </map> |
| </option> |
| <option name="updateDate" value="1560938058938" /> |
| </EduTask> |
| </list> |
| </option> |
| </Lesson> |
| <Lesson> |
| <option name="customPresentableName" /> |
| <option name="id" value="238433" /> |
| <option name="index" value="7" /> |
| <option name="name" value="Side Input" /> |
| <option name="stepikChangeStatus" value="Content changed" /> |
| <option name="updateDate" value="1560938065022" /> |
| <option name="unitId" value="210893" /> |
| <option name="items"> |
| <list> |
| <EduTask> |
| <option name="customPresentableName" /> |
| <option name="descriptionFormat" value="HTML" /> |
| <option name="descriptionText" value="<!-- ~ Licensed to the Apache Software Foundation (ASF) under one ~ or more contributor license agreements. See the NOTICE file ~ distributed with this work for additional information ~ regarding copyright ownership. The ASF licenses this file ~ to you under the Apache License, Version 2.0 (the ~ "License"); you may not use this file except in compliance ~ with the License. You may obtain a copy of the License at ~ ~ http://www.apache.org/licenses/LICENSE-2.0 ~ ~ Unless required by applicable law or agreed to in writing, software ~ distributed under the License is distributed on an "AS IS" BASIS, ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. ~ See the License for the specific language governing permissions and ~ limitations under the License. --> <html> <h2>Side Input</h2> <p> In addition to the main input PCollection, you can provide additional inputs to a ParDo transform in the form of side inputs. A side input is an additional input that your DoFn can access each time it processes an element in the input PCollection. When you specify a side input, you create a view of some other data that can be read from within the ParDo transform’s DoFn while processing each element. </p> <p> Side inputs are useful if your ParDo needs to inject additional data when processing each element in the input PCollection, but the additional data needs to be determined at runtime (and not hard-coded). Such values might be determined by the input data, or depend on a different branch of your pipeline. </p> <p> <b>Kata:</b> Please enrich each Person with the country based on the city they live in. </p> <br> <div class="hint"> Override the <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.DoFn.process"> process</a> method so that it also accepts a side input argument. 
</div> <div class="hint"> Use <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.ParDo"> ParDo</a> with a <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.DoFn"> DoFn</a> that accepts a side input. </div> <div class="hint"> Refer to the Beam Programming Guide <a href="https://beam.apache.org/documentation/programming-guide/#side-inputs">"Side inputs"</a> section for more information. </div> </html> " /> |
| <option name="feedbackLink"> |
| <FeedbackLink> |
| <option name="link" /> |
| <option name="type" value="STEPIK" /> |
| </FeedbackLink> |
| </option> |
| <option name="id" value="755590" /> |
| <option name="index" value="1" /> |
| <option name="name" value="Side Input" /> |
| <option name="record" value="-1" /> |
| <option name="status" value="Unchecked" /> |
| <option name="stepikChangeStatus" value="Up to date" /> |
| <option name="files"> |
| <map> |
| <entry key="task.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list> |
| <AnswerPlaceholder> |
| <option name="hints"> |
| <list /> |
| </option> |
| <option name="index" value="0" /> |
| <option name="initialState" /> |
| <option name="initializedFromDependency" value="false" /> |
| <option name="length" value="6" /> |
| <option name="offset" value="1534" /> |
| <option name="placeholderDependency" /> |
| <option name="placeholderText" value="TODO()" /> |
| <option name="possibleAnswer" value="def process(self, element, cities_to_countries): yield Person(element.name, element.city, cities_to_countries[element.city])" /> |
| <option name="selected" value="false" /> |
| <option name="status" value="Unchecked" /> |
| <option name="studentAnswer" /> |
| <option name="useLength" value="false" /> |
| </AnswerPlaceholder> |
| <AnswerPlaceholder> |
| <option name="hints"> |
| <list /> |
| </option> |
| <option name="index" value="1" /> |
| <option name="initialState" /> |
| <option name="initializedFromDependency" value="false" /> |
| <option name="length" value="6" /> |
| <option name="offset" value="2096" /> |
| <option name="placeholderDependency" /> |
| <option name="placeholderText" value="TODO()" /> |
| <option name="possibleAnswer" value="beam.ParDo(EnrichCountryDoFn(), cities_to_countries)" /> |
| <option name="selected" value="false" /> |
| <option name="status" value="Unchecked" /> |
| <option name="studentAnswer" /> |
| <option name="useLength" value="false" /> |
| </AnswerPlaceholder> |
| </list> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="task.py" /> |
| <option name="text" value="# TODO: type solution here " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="true" /> |
| </TaskFile> |
| </value> |
| </entry> |
| <entry key="tests.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list /> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="tests.py" /> |
| <option name="text" value="from test_helper import run_common_tests, failed, passed, get_answer_placeholders def test_answer_placeholders(): placeholders = get_answer_placeholders() placeholder = placeholders[0] if placeholder == "": # TODO: your condition here passed() else: failed() if __name__ == '__main__': run_common_tests() # test_answer_placeholders() # TODO: uncomment test call " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="false" /> |
| </TaskFile> |
| </value> |
| </entry> |
| </map> |
| </option> |
| <option name="updateDate" value="1560938069904" /> |
| </EduTask> |
| </list> |
| </option> |
| </Lesson> |
| <Lesson> |
| <option name="customPresentableName" /> |
| <option name="id" value="238434" /> |
| <option name="index" value="8" /> |
| <option name="name" value="Side Output" /> |
| <option name="stepikChangeStatus" value="Content changed" /> |
| <option name="updateDate" value="1560938076976" /> |
| <option name="unitId" value="210894" /> |
| <option name="items"> |
| <list> |
| <EduTask> |
| <option name="customPresentableName" /> |
| <option name="descriptionFormat" value="HTML" /> |
| <option name="descriptionText" value="<!-- ~ Licensed to the Apache Software Foundation (ASF) under one ~ or more contributor license agreements. See the NOTICE file ~ distributed with this work for additional information ~ regarding copyright ownership. The ASF licenses this file ~ to you under the Apache License, Version 2.0 (the ~ "License"); you may not use this file except in compliance ~ with the License. You may obtain a copy of the License at ~ ~ http://www.apache.org/licenses/LICENSE-2.0 ~ ~ Unless required by applicable law or agreed to in writing, software ~ distributed under the License is distributed on an "AS IS" BASIS, ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. ~ See the License for the specific language governing permissions and ~ limitations under the License. --> <html> <h2>Side Output</h2> <p> While ParDo always produces a main output PCollection (as the return value of applying the transform), you can also have your ParDo produce any number of additional output PCollections. If you choose to have multiple outputs, your ParDo returns all of the output PCollections (including the main output) bundled together. </p> <p> <b>Kata:</b> Add an additional output to your ParDo for numbers greater than 100. </p> <br> <div class="hint"> Use <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.pvalue.html#apache_beam.pvalue.TaggedOutput"> pvalue.TaggedOutput</a> and <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.ParDo.with_outputs"> .with_outputs</a> to emit multiple tagged outputs from a <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.ParDo"> ParDo</a>. </div> <div class="hint"> Refer to the Beam Programming Guide <a href="https://beam.apache.org/documentation/programming-guide/#additional-outputs"> "Additional outputs"</a> section for more information. </div> </html> " /> |
| <option name="feedbackLink"> |
| <FeedbackLink> |
| <option name="link" /> |
| <option name="type" value="STEPIK" /> |
| </FeedbackLink> |
| </option> |
| <option name="id" value="755591" /> |
| <option name="index" value="1" /> |
| <option name="name" value="Side Output" /> |
| <option name="record" value="-1" /> |
| <option name="status" value="Unchecked" /> |
| <option name="stepikChangeStatus" value="Up to date" /> |
| <option name="files"> |
| <map> |
| <entry key="task.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list> |
| <AnswerPlaceholder> |
| <option name="hints"> |
| <list /> |
| </option> |
| <option name="index" value="0" /> |
| <option name="initialState" /> |
| <option name="initializedFromDependency" value="false" /> |
| <option name="length" value="6" /> |
| <option name="offset" value="1011" /> |
| <option name="placeholderDependency" /> |
| <option name="placeholderText" value="TODO()" /> |
| <option name="possibleAnswer" value="def process(self, element): if element <= 100: yield element else: yield pvalue.TaggedOutput(num_above_100_tag, element)" /> |
| <option name="selected" value="false" /> |
| <option name="status" value="Unchecked" /> |
| <option name="studentAnswer" /> |
| <option name="useLength" value="false" /> |
| </AnswerPlaceholder> |
| <AnswerPlaceholder> |
| <option name="hints"> |
| <list /> |
| </option> |
| <option name="index" value="1" /> |
| <option name="initialState" /> |
| <option name="initializedFromDependency" value="false" /> |
| <option name="length" value="6" /> |
| <option name="offset" value="1264" /> |
| <option name="placeholderDependency" /> |
| <option name="placeholderText" value="TODO()" /> |
| <option name="possibleAnswer" value="beam.ParDo(ProcessNumbersDoFn()) .with_outputs(num_above_100_tag, main=num_below_100_tag))" /> |
| <option name="selected" value="false" /> |
| <option name="status" value="Unchecked" /> |
| <option name="studentAnswer" /> |
| <option name="useLength" value="false" /> |
| </AnswerPlaceholder> |
| </list> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="task.py" /> |
| <option name="text" value="# TODO: type solution here " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="true" /> |
| </TaskFile> |
| </value> |
| </entry> |
| <entry key="tests.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list /> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="tests.py" /> |
| <option name="text" value="from test_helper import run_common_tests, failed, passed, get_answer_placeholders def test_answer_placeholders(): placeholders = get_answer_placeholders() placeholder = placeholders[0] if placeholder == "": # TODO: your condition here passed() else: failed() if __name__ == '__main__': run_common_tests() # test_answer_placeholders() # TODO: uncomment test call " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="false" /> |
| </TaskFile> |
| </value> |
| </entry> |
| </map> |
| </option> |
| <option name="updateDate" value="1560938083234" /> |
| </EduTask> |
| </list> |
| </option> |
| </Lesson> |
| <Lesson> |
| <option name="customPresentableName" /> |
| <option name="id" value="238435" /> |
| <option name="index" value="9" /> |
| <option name="name" value="Branching" /> |
| <option name="stepikChangeStatus" value="Content changed" /> |
| <option name="updateDate" value="1560938090650" /> |
| <option name="unitId" value="210895" /> |
| <option name="items"> |
| <list> |
| <EduTask> |
| <option name="customPresentableName" /> |
| <option name="descriptionFormat" value="HTML" /> |
| <option name="descriptionText" value="<!-- ~ Licensed to the Apache Software Foundation (ASF) under one ~ or more contributor license agreements. See the NOTICE file ~ distributed with this work for additional information ~ regarding copyright ownership. The ASF licenses this file ~ to you under the Apache License, Version 2.0 (the ~ "License"); you may not use this file except in compliance ~ with the License. You may obtain a copy of the License at ~ ~ http://www.apache.org/licenses/LICENSE-2.0 ~ ~ Unless required by applicable law or agreed to in writing, software ~ distributed under the License is distributed on an "AS IS" BASIS, ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. ~ See the License for the specific language governing permissions and ~ limitations under the License. --> <html> <h2>Branching</h2> <p> You can use the same PCollection as input for multiple transforms without consuming the input or altering it. </p> <p> <b>Kata:</b> Branch out the numbers to two different transforms: one multiplies each number by 5 and the other multiplies each number by 10. </p> <br> <div class="hint"> Refer to the Beam Design Your Pipeline Guide <a href="https://beam.apache.org/documentation/pipelines/design-your-pipeline/#multiple-transforms-process-the-same-pcollection"> "Multiple transforms process the same PCollection"</a> section for more information. </div> </html> " /> |
| <option name="feedbackLink"> |
| <FeedbackLink> |
| <option name="link" /> |
| <option name="type" value="STEPIK" /> |
| </FeedbackLink> |
| </option> |
| <option name="id" value="755592" /> |
| <option name="index" value="1" /> |
| <option name="name" value="Branching" /> |
| <option name="record" value="-1" /> |
| <option name="status" value="Unchecked" /> |
| <option name="stepikChangeStatus" value="Up to date" /> |
| <option name="files"> |
| <map> |
| <entry key="task.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list> |
| <AnswerPlaceholder> |
| <option name="hints"> |
| <list /> |
| </option> |
| <option name="index" value="0" /> |
| <option name="initialState" /> |
| <option name="initializedFromDependency" value="false" /> |
| <option name="length" value="6" /> |
| <option name="offset" value="945" /> |
| <option name="placeholderDependency" /> |
| <option name="placeholderText" value="TODO()" /> |
| <option name="possibleAnswer" value="numbers | beam.Map(lambda num: num * 5)" /> |
| <option name="selected" value="false" /> |
| <option name="status" value="Unchecked" /> |
| <option name="studentAnswer" /> |
| <option name="useLength" value="false" /> |
| </AnswerPlaceholder> |
| <AnswerPlaceholder> |
| <option name="hints"> |
| <list /> |
| </option> |
| <option name="index" value="1" /> |
| <option name="initialState" /> |
| <option name="initializedFromDependency" value="false" /> |
| <option name="length" value="6" /> |
| <option name="offset" value="1002" /> |
| <option name="placeholderDependency" /> |
| <option name="placeholderText" value="TODO()" /> |
| <option name="possibleAnswer" value="numbers | beam.Map(lambda num: num * 10)" /> |
| <option name="selected" value="false" /> |
| <option name="status" value="Unchecked" /> |
| <option name="studentAnswer" /> |
| <option name="useLength" value="false" /> |
| </AnswerPlaceholder> |
| </list> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="task.py" /> |
| <option name="text" value="# TODO: type solution here " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="true" /> |
| </TaskFile> |
| </value> |
| </entry> |
| <entry key="tests.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list /> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="tests.py" /> |
| <option name="text" value="from test_helper import run_common_tests, failed, passed, get_answer_placeholders def test_answer_placeholders(): placeholders = get_answer_placeholders() placeholder = placeholders[0] if placeholder == "": # TODO: your condition here passed() else: failed() if __name__ == '__main__': run_common_tests() # test_answer_placeholders() # TODO: uncomment test call " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="false" /> |
| </TaskFile> |
| </value> |
| </entry> |
| </map> |
| </option> |
| <option name="updateDate" value="1560938095634" /> |
| </EduTask> |
| </list> |
| </option> |
| </Lesson> |
| <Lesson> |
| <option name="customPresentableName" /> |
| <option name="id" value="238436" /> |
| <option name="index" value="10" /> |
| <option name="name" value="Composite Transform" /> |
| <option name="stepikChangeStatus" value="Content changed" /> |
| <option name="updateDate" value="1560938102699" /> |
| <option name="unitId" value="210896" /> |
| <option name="items"> |
| <list> |
| <EduTask> |
| <option name="customPresentableName" /> |
| <option name="descriptionFormat" value="HTML" /> |
| <option name="descriptionText" value="<!-- ~ Licensed to the Apache Software Foundation (ASF) under one ~ or more contributor license agreements. See the NOTICE file ~ distributed with this work for additional information ~ regarding copyright ownership. The ASF licenses this file ~ to you under the Apache License, Version 2.0 (the ~ "License"); you may not use this file except in compliance ~ with the License. You may obtain a copy of the License at ~ ~ http://www.apache.org/licenses/LICENSE-2.0 ~ ~ Unless required by applicable law or agreed to in writing, software ~ distributed under the License is distributed on an "AS IS" BASIS, ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. ~ See the License for the specific language governing permissions and ~ limitations under the License. --> <html> <h2>Composite Transform</h2> <p> Transforms can have a nested structure, where a complex transform performs multiple simpler transforms (such as more than one ParDo, Combine, GroupByKey, or even other composite transforms). These transforms are called composite transforms. Nesting multiple transforms inside a single composite transform can make your code more modular and easier to understand. </p> <p> To create your own composite transform, create a subclass of the PTransform class and override the expand method, which is where you add the processing logic. Your override of expand must accept the appropriate type of input PCollection as a parameter and return the output PCollection. You can then use this transform just as you would a built-in transform from the Beam SDK. </p> <p> <b>Kata:</b> Please implement a composite transform "ExtractAndMultiplyNumbers" that extracts numbers from a comma-separated line and then multiplies each number by 10. 
</p> <br> <div class="hint"> Refer to <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.ptransform.html#apache_beam.transforms.ptransform.PTransform"> PTransform</a>. </div> <div class="hint"> Refer to the Beam Programming Guide <a href="https://beam.apache.org/documentation/programming-guide/#composite-transforms"> "Composite transforms"</a> section for more information. </div> </html> " /> |
| <option name="feedbackLink"> |
| <FeedbackLink> |
| <option name="link" /> |
| <option name="type" value="STEPIK" /> |
| </FeedbackLink> |
| </option> |
| <option name="id" value="755593" /> |
| <option name="index" value="1" /> |
| <option name="name" value="Composite Transform" /> |
| <option name="record" value="-1" /> |
| <option name="status" value="Unchecked" /> |
| <option name="stepikChangeStatus" value="Up to date" /> |
| <option name="files"> |
| <map> |
| <entry key="task.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list> |
| <AnswerPlaceholder> |
| <option name="hints"> |
| <list /> |
| </option> |
| <option name="index" value="0" /> |
| <option name="initialState" /> |
| <option name="initializedFromDependency" value="false" /> |
| <option name="length" value="6" /> |
| <option name="offset" value="920" /> |
| <option name="placeholderDependency" /> |
| <option name="placeholderText" value="TODO()" /> |
| <option name="possibleAnswer" value="def expand(self, pcoll): return (pcoll | beam.FlatMap(lambda line: map(int, line.split(','))) | beam.Map(lambda num: num * 10) )" /> |
| <option name="selected" value="false" /> |
| <option name="status" value="Unchecked" /> |
| <option name="studentAnswer" /> |
| <option name="useLength" value="false" /> |
| </AnswerPlaceholder> |
| <AnswerPlaceholder> |
| <option name="hints"> |
| <list /> |
| </option> |
| <option name="index" value="1" /> |
| <option name="initialState" /> |
| <option name="initializedFromDependency" value="false" /> |
| <option name="length" value="6" /> |
| <option name="offset" value="1179" /> |
| <option name="placeholderDependency" /> |
| <option name="placeholderText" value="TODO()" /> |
| <option name="possibleAnswer" value="ExtractAndMultiplyNumbers()" /> |
| <option name="selected" value="false" /> |
| <option name="status" value="Unchecked" /> |
| <option name="studentAnswer" /> |
| <option name="useLength" value="false" /> |
| </AnswerPlaceholder> |
| </list> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="task.py" /> |
| <option name="text" value="# TODO: type solution here " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="true" /> |
| </TaskFile> |
| </value> |
| </entry> |
| <entry key="tests.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list /> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="tests.py" /> |
| <option name="text" value="from test_helper import run_common_tests, failed, passed, get_answer_placeholders def test_answer_placeholders(): placeholders = get_answer_placeholders() placeholder = placeholders[0] if placeholder == "": # TODO: your condition here passed() else: failed() if __name__ == '__main__': run_common_tests() # test_answer_placeholders() # TODO: uncomment test call " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="false" /> |
| </TaskFile> |
| </value> |
| </entry> |
| </map> |
| </option> |
| <option name="updateDate" value="1560938107880" /> |
| </EduTask> |
| </list> |
| </option> |
| </Lesson> |
| </list> |
| </option> |
| </Section> |
| <Section> |
| <option name="courseId" value="54532" /> |
| <option name="customPresentableName" /> |
| <option name="id" value="85646" /> |
| <option name="index" value="3" /> |
| <option name="name" value="Common Transforms" /> |
| <option name="position" value="0" /> |
| <option name="stepikChangeStatus" value="Up to date" /> |
| <option name="updateDate" value="1560431009000" /> |
| <option name="items"> |
| <list> |
| <Lesson> |
| <option name="customPresentableName" /> |
| <option name="id" value="238437" /> |
| <option name="index" value="1" /> |
| <option name="name" value="Filter" /> |
| <option name="stepikChangeStatus" value="Content changed" /> |
| <option name="updateDate" value="1560938208485" /> |
| <option name="unitId" value="210897" /> |
| <option name="items"> |
| <list> |
| <EduTask> |
| <option name="customPresentableName" /> |
| <option name="descriptionFormat" value="HTML" /> |
| <option name="descriptionText" value="<!-- ~ Licensed to the Apache Software Foundation (ASF) under one ~ or more contributor license agreements. See the NOTICE file ~ distributed with this work for additional information ~ regarding copyright ownership. The ASF licenses this file ~ to you under the Apache License, Version 2.0 (the ~ "License"); you may not use this file except in compliance ~ with the License. You may obtain a copy of the License at ~ ~ http://www.apache.org/licenses/LICENSE-2.0 ~ ~ Unless required by applicable law or agreed to in writing, software ~ distributed under the License is distributed on an "AS IS" BASIS, ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. ~ See the License for the specific language governing permissions and ~ limitations under the License. --> <html> <h2>Filter using ParDo</h2> <p> <b>Kata:</b> Implement a filter function that filters out the even numbers by using <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.ParDo"> ParDo</a>. </p> <br> <div class="hint"> Override the <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.DoFn.process"> process</a> method. Use "yield" to emit each element you want to keep. </div> </html> " /> |
| <option name="feedbackLink"> |
| <FeedbackLink> |
| <option name="link" /> |
| <option name="type" value="STEPIK" /> |
| </FeedbackLink> |
| </option> |
| <option name="id" value="755595" /> |
| <option name="index" value="1" /> |
| <option name="name" value="ParDo" /> |
| <option name="record" value="-1" /> |
| <option name="status" value="Unchecked" /> |
| <option name="stepikChangeStatus" value="Up to date" /> |
| <option name="files"> |
| <map> |
| <entry key="task.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list> |
| <AnswerPlaceholder> |
| <option name="hints"> |
| <list /> |
| </option> |
| <option name="index" value="0" /> |
| <option name="initialState" /> |
| <option name="initializedFromDependency" value="false" /> |
| <option name="length" value="6" /> |
| <option name="offset" value="942" /> |
| <option name="placeholderDependency" /> |
| <option name="placeholderText" value="TODO()" /> |
| <option name="possibleAnswer" value="def process(self, element): if element % 2 == 1: yield element" /> |
| <option name="selected" value="false" /> |
| <option name="status" value="Unchecked" /> |
| <option name="studentAnswer" /> |
| <option name="useLength" value="false" /> |
| </AnswerPlaceholder> |
| </list> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="task.py" /> |
| <option name="text" value="# TODO: type solution here " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="true" /> |
| </TaskFile> |
| </value> |
| </entry> |
| <entry key="tests.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list /> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="tests.py" /> |
| <option name="text" value="from test_helper import run_common_tests, failed, passed, get_answer_placeholders def test_answer_placeholders(): placeholders = get_answer_placeholders() placeholder = placeholders[0] if placeholder == "": # TODO: your condition here passed() else: failed() if __name__ == '__main__': run_common_tests() # test_answer_placeholders() # TODO: uncomment test call " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="false" /> |
| </TaskFile> |
| </value> |
| </entry> |
| </map> |
| </option> |
| <option name="updateDate" value="1560938213611" /> |
| </EduTask> |
| <EduTask> |
| <option name="customPresentableName" /> |
| <option name="descriptionFormat" value="HTML" /> |
| <option name="descriptionText" value="<!-- ~ Licensed to the Apache Software Foundation (ASF) under one ~ or more contributor license agreements. See the NOTICE file ~ distributed with this work for additional information ~ regarding copyright ownership. The ASF licenses this file ~ to you under the Apache License, Version 2.0 (the ~ "License"); you may not use this file except in compliance ~ with the License. You may obtain a copy of the License at ~ ~ http://www.apache.org/licenses/LICENSE-2.0 ~ ~ Unless required by applicable law or agreed to in writing, software ~ distributed under the License is distributed on an "AS IS" BASIS, ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. ~ See the License for the specific language governing permissions and ~ limitations under the License. --> <html> <h2>Filter</h2> <p> The Beam SDKs provide language-specific ways to simplify how you provide your DoFn implementation. </p> <p> <b>Kata:</b> Implement a filter function that filters out the odd numbers by using <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.Filter"> Filter</a>. </p> <br> <div class="hint"> Use <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.Filter"> Filter</a> with a lambda. </div> </html> " /> |
| <option name="feedbackLink"> |
| <FeedbackLink> |
| <option name="link" /> |
| <option name="type" value="STEPIK" /> |
| </FeedbackLink> |
| </option> |
| <option name="id" value="755596" /> |
| <option name="index" value="2" /> |
| <option name="name" value="Filter" /> |
| <option name="record" value="-1" /> |
| <option name="status" value="Unchecked" /> |
| <option name="stepikChangeStatus" value="Up to date" /> |
| <option name="files"> |
| <map> |
| <entry key="task.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list> |
| <AnswerPlaceholder> |
| <option name="hints"> |
| <list /> |
| </option> |
| <option name="index" value="0" /> |
| <option name="initialState" /> |
| <option name="initializedFromDependency" value="false" /> |
| <option name="length" value="6" /> |
| <option name="offset" value="934" /> |
| <option name="placeholderDependency" /> |
| <option name="placeholderText" value="TODO()" /> |
| <option name="possibleAnswer" value="beam.Filter(lambda num: num % 2 == 0)" /> |
| <option name="selected" value="false" /> |
| <option name="status" value="Unchecked" /> |
| <option name="studentAnswer" /> |
| <option name="useLength" value="false" /> |
| </AnswerPlaceholder> |
| </list> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="task.py" /> |
| <option name="text" value="# TODO: type solution here " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="true" /> |
| </TaskFile> |
| </value> |
| </entry> |
| <entry key="tests.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list /> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="tests.py" /> |
| <option name="text" value="from test_helper import run_common_tests, failed, passed, get_answer_placeholders def test_answer_placeholders(): placeholders = get_answer_placeholders() placeholder = placeholders[0] if placeholder == "": # TODO: your condition here passed() else: failed() if __name__ == '__main__': run_common_tests() # test_answer_placeholders() # TODO: uncomment test call " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="false" /> |
| </TaskFile> |
| </value> |
| </entry> |
| </map> |
| </option> |
| <option name="updateDate" value="1560938217127" /> |
| </EduTask> |
| </list> |
| </option> |
| </Lesson> |
| <Lesson> |
| <option name="customPresentableName" /> |
| <option name="id" value="238438" /> |
| <option name="index" value="2" /> |
| <option name="name" value="Aggregation" /> |
| <option name="stepikChangeStatus" value="Content changed" /> |
| <option name="updateDate" value="1560938223924" /> |
| <option name="unitId" value="210898" /> |
| <option name="items"> |
| <list> |
| <EduTask> |
| <option name="customPresentableName" /> |
| <option name="descriptionFormat" value="HTML" /> |
| <option name="descriptionText" value="<!-- ~ Licensed to the Apache Software Foundation (ASF) under one ~ or more contributor license agreements. See the NOTICE file ~ distributed with this work for additional information ~ regarding copyright ownership. The ASF licenses this file ~ to you under the Apache License, Version 2.0 (the ~ "License"); you may not use this file except in compliance ~ with the License. You may obtain a copy of the License at ~ ~ http://www.apache.org/licenses/LICENSE-2.0 ~ ~ Unless required by applicable law or agreed to in writing, software ~ distributed under the License is distributed on an "AS IS" BASIS, ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. ~ See the License for the specific language governing permissions and ~ limitations under the License. --> <html> <h2>Aggregation - Count</h2> <p> <b>Kata:</b> Count the number of elements from an input. </p> <br> <div class="hint"> Use <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.combiners.html#apache_beam.transforms.combiners.Count"> Count</a>. </div> </html> " /> |
| <option name="feedbackLink"> |
| <FeedbackLink> |
| <option name="link" /> |
| <option name="type" value="STEPIK" /> |
| </FeedbackLink> |
| </option> |
| <option name="id" value="755597" /> |
| <option name="index" value="1" /> |
| <option name="name" value="Count" /> |
| <option name="record" value="-1" /> |
| <option name="status" value="Unchecked" /> |
| <option name="stepikChangeStatus" value="Up to date" /> |
| <option name="files"> |
| <map> |
| <entry key="task.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list> |
| <AnswerPlaceholder> |
| <option name="hints"> |
| <list /> |
| </option> |
| <option name="index" value="0" /> |
| <option name="initialState" /> |
| <option name="initializedFromDependency" value="false" /> |
| <option name="length" value="6" /> |
| <option name="offset" value="934" /> |
| <option name="placeholderDependency" /> |
| <option name="placeholderText" value="TODO()" /> |
| <option name="possibleAnswer" value="beam.combiners.Count.Globally()" /> |
| <option name="selected" value="false" /> |
| <option name="status" value="Unchecked" /> |
| <option name="studentAnswer" /> |
| <option name="useLength" value="false" /> |
| </AnswerPlaceholder> |
| </list> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="task.py" /> |
| <option name="text" value="# TODO: type solution here " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="true" /> |
| </TaskFile> |
| </value> |
| </entry> |
| <entry key="tests.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list /> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="tests.py" /> |
| <option name="text" value="from test_helper import run_common_tests, failed, passed, get_answer_placeholders def test_answer_placeholders(): placeholders = get_answer_placeholders() placeholder = placeholders[0] if placeholder == "": # TODO: your condition here passed() else: failed() if __name__ == '__main__': run_common_tests() # test_answer_placeholders() # TODO: uncomment test call " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="false" /> |
| </TaskFile> |
| </value> |
| </entry> |
| </map> |
| </option> |
| <option name="updateDate" value="1560938230679" /> |
| </EduTask> |
| <EduTask> |
| <option name="customPresentableName" /> |
| <option name="descriptionFormat" value="HTML" /> |
| <option name="descriptionText" value="<!-- ~ Licensed to the Apache Software Foundation (ASF) under one ~ or more contributor license agreements. See the NOTICE file ~ distributed with this work for additional information ~ regarding copyright ownership. The ASF licenses this file ~ to you under the Apache License, Version 2.0 (the ~ "License"); you may not use this file except in compliance ~ with the License. You may obtain a copy of the License at ~ ~ http://www.apache.org/licenses/LICENSE-2.0 ~ ~ Unless required by applicable law or agreed to in writing, software ~ distributed under the License is distributed on an "AS IS" BASIS, ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. ~ See the License for the specific language governing permissions and ~ limitations under the License. --> <html> <h2>Aggregation - Sum</h2> <p> <b>Kata:</b> Compute the sum of all elements from an input. </p> <br> <div class="hint"> Use <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.CombineGlobally"> CombineGlobally</a> and Python built-in <a href="https://docs.python.org/2/library/functions.html#sum">sum</a>. </div> </html> " /> |
| <option name="feedbackLink"> |
| <FeedbackLink> |
| <option name="link" /> |
| <option name="type" value="STEPIK" /> |
| </FeedbackLink> |
| </option> |
| <option name="id" value="755598" /> |
| <option name="index" value="2" /> |
| <option name="name" value="Sum" /> |
| <option name="record" value="-1" /> |
| <option name="status" value="Unchecked" /> |
| <option name="stepikChangeStatus" value="Up to date" /> |
| <option name="files"> |
| <map> |
| <entry key="task.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list> |
| <AnswerPlaceholder> |
| <option name="hints"> |
| <list /> |
| </option> |
| <option name="index" value="0" /> |
| <option name="initialState" /> |
| <option name="initializedFromDependency" value="false" /> |
| <option name="length" value="6" /> |
| <option name="offset" value="934" /> |
| <option name="placeholderDependency" /> |
| <option name="placeholderText" value="TODO()" /> |
| <option name="possibleAnswer" value="beam.CombineGlobally(sum)" /> |
| <option name="selected" value="false" /> |
| <option name="status" value="Unchecked" /> |
| <option name="studentAnswer" /> |
| <option name="useLength" value="false" /> |
| </AnswerPlaceholder> |
| </list> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="task.py" /> |
| <option name="text" value="# TODO: type solution here " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="true" /> |
| </TaskFile> |
| </value> |
| </entry> |
| <entry key="tests.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list /> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="tests.py" /> |
| <option name="text" value="from test_helper import run_common_tests, failed, passed, get_answer_placeholders def test_answer_placeholders(): placeholders = get_answer_placeholders() placeholder = placeholders[0] if placeholder == "": # TODO: your condition here passed() else: failed() if __name__ == '__main__': run_common_tests() # test_answer_placeholders() # TODO: uncomment test call " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="false" /> |
| </TaskFile> |
| </value> |
| </entry> |
| </map> |
| </option> |
| <option name="updateDate" value="1560938232928" /> |
| </EduTask> |
| <EduTask> |
| <option name="customPresentableName" /> |
| <option name="descriptionFormat" value="HTML" /> |
| <option name="descriptionText" value="<!-- ~ Licensed to the Apache Software Foundation (ASF) under one ~ or more contributor license agreements. See the NOTICE file ~ distributed with this work for additional information ~ regarding copyright ownership. The ASF licenses this file ~ to you under the Apache License, Version 2.0 (the ~ "License"); you may not use this file except in compliance ~ with the License. You may obtain a copy of the License at ~ ~ http://www.apache.org/licenses/LICENSE-2.0 ~ ~ Unless required by applicable law or agreed to in writing, software ~ distributed under the License is distributed on an "AS IS" BASIS, ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. ~ See the License for the specific language governing permissions and ~ limitations under the License. --> <html> <h2>Aggregation - Mean</h2> <p> <b>Kata:</b> Compute the mean/average of all elements from an input. </p> <br> <div class="hint"> Use <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.combiners.html#apache_beam.transforms.combiners.Mean"> Mean</a>. </div> </html> " /> |
| <option name="feedbackLink"> |
| <FeedbackLink> |
| <option name="link" /> |
| <option name="type" value="STEPIK" /> |
| </FeedbackLink> |
| </option> |
| <option name="id" value="755599" /> |
| <option name="index" value="3" /> |
| <option name="name" value="Mean" /> |
| <option name="record" value="-1" /> |
| <option name="status" value="Unchecked" /> |
| <option name="stepikChangeStatus" value="Up to date" /> |
| <option name="files"> |
| <map> |
| <entry key="task.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list> |
| <AnswerPlaceholder> |
| <option name="hints"> |
| <list /> |
| </option> |
| <option name="index" value="0" /> |
| <option name="initialState" /> |
| <option name="initializedFromDependency" value="false" /> |
| <option name="length" value="6" /> |
| <option name="offset" value="934" /> |
| <option name="placeholderDependency" /> |
| <option name="placeholderText" value="TODO()" /> |
| <option name="possibleAnswer" value="beam.combiners.Mean.Globally()" /> |
| <option name="selected" value="false" /> |
| <option name="status" value="Unchecked" /> |
| <option name="studentAnswer" /> |
| <option name="useLength" value="false" /> |
| </AnswerPlaceholder> |
| </list> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="task.py" /> |
| <option name="text" value="# TODO: type solution here " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="true" /> |
| </TaskFile> |
| </value> |
| </entry> |
| <entry key="tests.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list /> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="tests.py" /> |
| <option name="text" value="from test_helper import run_common_tests, failed, passed, get_answer_placeholders def test_answer_placeholders(): placeholders = get_answer_placeholders() placeholder = placeholders[0] if placeholder == "": # TODO: your condition here passed() else: failed() if __name__ == '__main__': run_common_tests() # test_answer_placeholders() # TODO: uncomment test call " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="false" /> |
| </TaskFile> |
| </value> |
| </entry> |
| </map> |
| </option> |
| <option name="updateDate" value="1560938235730" /> |
| </EduTask> |
| <EduTask> |
| <option name="customPresentableName" /> |
| <option name="descriptionFormat" value="HTML" /> |
| <option name="descriptionText" value="<!-- ~ Licensed to the Apache Software Foundation (ASF) under one ~ or more contributor license agreements. See the NOTICE file ~ distributed with this work for additional information ~ regarding copyright ownership. The ASF licenses this file ~ to you under the Apache License, Version 2.0 (the ~ "License"); you may not use this file except in compliance ~ with the License. You may obtain a copy of the License at ~ ~ http://www.apache.org/licenses/LICENSE-2.0 ~ ~ Unless required by applicable law or agreed to in writing, software ~ distributed under the License is distributed on an "AS IS" BASIS, ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. ~ See the License for the specific language governing permissions and ~ limitations under the License. --> <html> <h2>Aggregation - Smallest</h2> <p> <b>Kata:</b> Compute the smallest of the elements from an input. </p> <br> <div class="hint"> Use <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.combiners.html#apache_beam.transforms.combiners.Top.Smallest"> Top.Smallest</a>. </div> </html> " /> |
| <option name="feedbackLink"> |
| <FeedbackLink> |
| <option name="link" /> |
| <option name="type" value="STEPIK" /> |
| </FeedbackLink> |
| </option> |
| <option name="id" value="755600" /> |
| <option name="index" value="4" /> |
| <option name="name" value="Smallest" /> |
| <option name="record" value="-1" /> |
| <option name="status" value="Unchecked" /> |
| <option name="stepikChangeStatus" value="Up to date" /> |
| <option name="files"> |
| <map> |
| <entry key="task.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list> |
| <AnswerPlaceholder> |
| <option name="hints"> |
| <list /> |
| </option> |
| <option name="index" value="0" /> |
| <option name="initialState" /> |
| <option name="initializedFromDependency" value="false" /> |
| <option name="length" value="6" /> |
| <option name="offset" value="934" /> |
| <option name="placeholderDependency" /> |
| <option name="placeholderText" value="TODO()" /> |
| <option name="possibleAnswer" value="beam.combiners.Top.Smallest(1)" /> |
| <option name="selected" value="false" /> |
| <option name="status" value="Unchecked" /> |
| <option name="studentAnswer" /> |
| <option name="useLength" value="false" /> |
| </AnswerPlaceholder> |
| </list> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="task.py" /> |
| <option name="text" value="# TODO: type solution here " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="true" /> |
| </TaskFile> |
| </value> |
| </entry> |
| <entry key="tests.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list /> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="tests.py" /> |
| <option name="text" value="from test_helper import run_common_tests, failed, passed, get_answer_placeholders def test_answer_placeholders(): placeholders = get_answer_placeholders() placeholder = placeholders[0] if placeholder == "": # TODO: your condition here passed() else: failed() if __name__ == '__main__': run_common_tests() # test_answer_placeholders() # TODO: uncomment test call " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="false" /> |
| </TaskFile> |
| </value> |
| </entry> |
| </map> |
| </option> |
| <option name="updateDate" value="1560938237747" /> |
| </EduTask> |
| <EduTask> |
| <option name="customPresentableName" /> |
| <option name="descriptionFormat" value="HTML" /> |
| <option name="descriptionText" value="<!-- ~ Licensed to the Apache Software Foundation (ASF) under one ~ or more contributor license agreements. See the NOTICE file ~ distributed with this work for additional information ~ regarding copyright ownership. The ASF licenses this file ~ to you under the Apache License, Version 2.0 (the ~ "License"); you may not use this file except in compliance ~ with the License. You may obtain a copy of the License at ~ ~ http://www.apache.org/licenses/LICENSE-2.0 ~ ~ Unless required by applicable law or agreed to in writing, software ~ distributed under the License is distributed on an "AS IS" BASIS, ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. ~ See the License for the specific language governing permissions and ~ limitations under the License. --> <html> <h2>Aggregation - Largest</h2> <p> <b>Kata:</b> Compute the largest of the elements from an input. </p> <br> <div class="hint"> Use <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.combiners.html#apache_beam.transforms.combiners.Top.Largest"> Top.Largest</a>. </div> </html> " /> |
| <option name="feedbackLink"> |
| <FeedbackLink> |
| <option name="link" /> |
| <option name="type" value="STEPIK" /> |
| </FeedbackLink> |
| </option> |
| <option name="id" value="755601" /> |
| <option name="index" value="5" /> |
| <option name="name" value="Largest" /> |
| <option name="record" value="-1" /> |
| <option name="status" value="Unchecked" /> |
| <option name="stepikChangeStatus" value="Up to date" /> |
| <option name="files"> |
| <map> |
| <entry key="task.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list> |
| <AnswerPlaceholder> |
| <option name="hints"> |
| <list /> |
| </option> |
| <option name="index" value="0" /> |
| <option name="initialState" /> |
| <option name="initializedFromDependency" value="false" /> |
| <option name="length" value="6" /> |
| <option name="offset" value="934" /> |
| <option name="placeholderDependency" /> |
| <option name="placeholderText" value="TODO()" /> |
| <option name="possibleAnswer" value="beam.combiners.Top.Largest(1)" /> |
| <option name="selected" value="false" /> |
| <option name="status" value="Unchecked" /> |
| <option name="studentAnswer" /> |
| <option name="useLength" value="false" /> |
| </AnswerPlaceholder> |
| </list> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="task.py" /> |
| <option name="text" value="# TODO: type solution here " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="true" /> |
| </TaskFile> |
| </value> |
| </entry> |
| <entry key="tests.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list /> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="tests.py" /> |
| <option name="text" value="from test_helper import run_common_tests, failed, passed, get_answer_placeholders def test_answer_placeholders(): placeholders = get_answer_placeholders() placeholder = placeholders[0] if placeholder == "": # TODO: your condition here passed() else: failed() if __name__ == '__main__': run_common_tests() # test_answer_placeholders() # TODO: uncomment test call " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="false" /> |
| </TaskFile> |
| </value> |
| </entry> |
| </map> |
| </option> |
| <option name="updateDate" value="1560938239860" /> |
| </EduTask> |
| </list> |
| </option> |
| </Lesson> |
| </list> |
| </option> |
| </Section> |
| <Section> |
| <option name="courseId" value="54532" /> |
| <option name="customPresentableName" /> |
| <option name="id" value="88017" /> |
| <option name="index" value="4" /> |
| <option name="name" value="IO" /> |
| <option name="position" value="5" /> |
| <option name="stepikChangeStatus" value="Up to date" /> |
| <option name="updateDate" value="1560436240000" /> |
| <option name="items"> |
| <list> |
| <Lesson> |
| <option name="customPresentableName" /> |
| <option name="id" value="238439" /> |
| <option name="index" value="1" /> |
| <option name="name" value="TextIO" /> |
| <option name="stepikChangeStatus" value="Content changed" /> |
| <option name="updateDate" value="1560938245888" /> |
| <option name="unitId" value="210899" /> |
| <option name="items"> |
| <list> |
| <EduTask> |
| <option name="customPresentableName" /> |
| <option name="descriptionFormat" value="HTML" /> |
| <option name="descriptionText" value="<!-- ~ Licensed to the Apache Software Foundation (ASF) under one ~ or more contributor license agreements. See the NOTICE file ~ distributed with this work for additional information ~ regarding copyright ownership. The ASF licenses this file ~ to you under the Apache License, Version 2.0 (the ~ "License"); you may not use this file except in compliance ~ with the License. You may obtain a copy of the License at ~ ~ http://www.apache.org/licenses/LICENSE-2.0 ~ ~ Unless required by applicable law or agreed to in writing, software ~ distributed under the License is distributed on an "AS IS" BASIS, ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. ~ See the License for the specific language governing permissions and ~ limitations under the License. --> <html> <h2>ReadFromText</h2> <p> When you create a pipeline, you often need to read data from some external source, such as a file or a database. Likewise, you may want your pipeline to output its result data to an external storage system. Beam provides read and write transforms for a number of common data storage types. If you want your pipeline to read from or write to a data storage format that isn’t supported by the built-in transforms, you can implement your own read and write transforms. </p> <p> To read a PCollection from one or more text files, use beam.io.ReadFromText to instantiate a transform and specify the path of the file(s) to be read. </p> <p> <b>Kata:</b> Read the 'countries.txt' file and convert each country name into uppercase. </p> <br> <div class="hint"> Use <a href="https://beam.apache.org/releases/pydoc/current/apache_beam.io.textio.html#apache_beam.io.textio.ReadFromText"> beam.io.ReadFromText</a>. </div> <div class="hint"> Refer to the Beam Programming Guide <a href="https://beam.apache.org/documentation/programming-guide/#pipeline-io-reading-data"> "Reading input data"</a> section for more information. </div> </html> " /> |
| <option name="feedbackLink"> |
| <FeedbackLink> |
| <option name="link" /> |
| <option name="type" value="STEPIK" /> |
| </FeedbackLink> |
| </option> |
| <option name="id" value="755602" /> |
| <option name="index" value="1" /> |
| <option name="name" value="ReadFromText" /> |
| <option name="record" value="-1" /> |
| <option name="status" value="Unchecked" /> |
| <option name="stepikChangeStatus" value="Up to date" /> |
| <option name="files"> |
| <map> |
| <entry key="task.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list> |
| <AnswerPlaceholder> |
| <option name="hints"> |
| <list /> |
| </option> |
| <option name="index" value="0" /> |
| <option name="initialState" /> |
| <option name="initializedFromDependency" value="false" /> |
| <option name="length" value="6" /> |
| <option name="offset" value="919" /> |
| <option name="placeholderDependency" /> |
| <option name="placeholderText" value="TODO()" /> |
| <option name="possibleAnswer" value="beam.io.ReadFromText(file_path)" /> |
| <option name="selected" value="false" /> |
| <option name="status" value="Unchecked" /> |
| <option name="studentAnswer" /> |
| <option name="useLength" value="false" /> |
| </AnswerPlaceholder> |
| <AnswerPlaceholder> |
| <option name="hints"> |
| <list /> |
| </option> |
| <option name="index" value="1" /> |
| <option name="initialState" /> |
| <option name="initializedFromDependency" value="false" /> |
| <option name="length" value="6" /> |
| <option name="offset" value="956" /> |
| <option name="placeholderDependency" /> |
| <option name="placeholderText" value="TODO()" /> |
| <option name="possibleAnswer" value="beam.Map(lambda country: country.upper())" /> |
| <option name="selected" value="false" /> |
| <option name="status" value="Unchecked" /> |
| <option name="studentAnswer" /> |
| <option name="useLength" value="false" /> |
| </AnswerPlaceholder> |
| </list> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="task.py" /> |
| <option name="text" value="# TODO: type solution here " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="true" /> |
| </TaskFile> |
| </value> |
| </entry> |
| <entry key="tests.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list /> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="tests.py" /> |
| <option name="text" value="from test_helper import run_common_tests, failed, passed, get_answer_placeholders def test_answer_placeholders(): placeholders = get_answer_placeholders() placeholder = placeholders[0] if placeholder == "": # TODO: your condition here passed() else: failed() if __name__ == '__main__': run_common_tests() # test_answer_placeholders() # TODO: uncomment test call " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="false" /> |
| </TaskFile> |
| </value> |
| </entry> |
| <entry key="countries.txt"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list /> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="countries.txt" /> |
| <option name="text" value="" /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="true" /> |
| </TaskFile> |
| </value> |
| </entry> |
| </map> |
| </option> |
| <option name="updateDate" value="1560938252130" /> |
| </EduTask> |
| </list> |
| </option> |
| </Lesson> |
| <Lesson> |
| <option name="customPresentableName" /> |
| <option name="id" value="238440" /> |
| <option name="index" value="2" /> |
| <option name="name" value="Built-in IOs" /> |
| <option name="stepikChangeStatus" value="Content changed" /> |
| <option name="updateDate" value="1560938258337" /> |
| <option name="unitId" value="210900" /> |
| <option name="items"> |
| <list> |
| <EduTask> |
| <option name="customPresentableName" /> |
| <option name="descriptionFormat" value="HTML" /> |
| <option name="descriptionText" value="<!-- ~ Licensed to the Apache Software Foundation (ASF) under one ~ or more contributor license agreements. See the NOTICE file ~ distributed with this work for additional information ~ regarding copyright ownership. The ASF licenses this file ~ to you under the Apache License, Version 2.0 (the ~ "License"); you may not use this file except in compliance ~ with the License. You may obtain a copy of the License at ~ ~ http://www.apache.org/licenses/LICENSE-2.0 ~ ~ Unless required by applicable law or agreed to in writing, software ~ distributed under the License is distributed on an "AS IS" BASIS, ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. ~ See the License for the specific language governing permissions and ~ limitations under the License. --> <html> <h2>Built-in I/Os</h2> <p> Beam SDKs provide many out-of-the-box I/O transforms that read from a variety of sources and write to a variety of sinks. </p> <p> See the <a href="https://beam.apache.org/documentation/io/built-in/">Beam-provided I/O Transforms</a> page for a list of the currently available I/O transforms. </p> </html> " /> |
| <option name="feedbackLink"> |
| <FeedbackLink> |
| <option name="link" /> |
| <option name="type" value="STEPIK" /> |
| </FeedbackLink> |
| </option> |
| <option name="id" value="755603" /> |
| <option name="index" value="1" /> |
| <option name="name" value="Built-in IOs" /> |
| <option name="record" value="-1" /> |
| <option name="status" value="Unchecked" /> |
| <option name="stepikChangeStatus" value="Up to date" /> |
| <option name="files"> |
| <map> |
| <entry key="task.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list /> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="task.py" /> |
| <option name="text" value="# TODO: type solution here " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="true" /> |
| </TaskFile> |
| </value> |
| </entry> |
| <entry key="tests.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list /> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="tests.py" /> |
| <option name="text" value="" /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="false" /> |
| </TaskFile> |
| </value> |
| </entry> |
| </map> |
| </option> |
| <option name="updateDate" value="1560938263697" /> |
| </EduTask> |
| </list> |
| </option> |
| </Lesson> |
| </list> |
| </option> |
| </Section> |
| <Section> |
| <option name="courseId" value="54532" /> |
| <option name="customPresentableName" /> |
| <option name="id" value="85647" /> |
| <option name="index" value="5" /> |
| <option name="name" value="Examples" /> |
| <option name="position" value="0" /> |
| <option name="stepikChangeStatus" value="Up to date" /> |
| <option name="updateDate" value="1560435414000" /> |
| <option name="items"> |
| <list> |
| <Lesson> |
| <option name="customPresentableName" /> |
| <option name="id" value="238441" /> |
| <option name="index" value="1" /> |
| <option name="name" value="Word Count" /> |
| <option name="stepikChangeStatus" value="Content changed" /> |
| <option name="updateDate" value="1560938269193" /> |
| <option name="unitId" value="210901" /> |
| <option name="items"> |
| <list> |
| <EduTask> |
| <option name="customPresentableName" /> |
| <option name="descriptionFormat" value="HTML" /> |
| <option name="descriptionText" value="<!-- ~ Licensed to the Apache Software Foundation (ASF) under one ~ or more contributor license agreements. See the NOTICE file ~ distributed with this work for additional information ~ regarding copyright ownership. The ASF licenses this file ~ to you under the Apache License, Version 2.0 (the ~ "License"); you may not use this file except in compliance ~ with the License. You may obtain a copy of the License at ~ ~ http://www.apache.org/licenses/LICENSE-2.0 ~ ~ Unless required by applicable law or agreed to in writing, software ~ distributed under the License is distributed on an "AS IS" BASIS, ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. ~ See the License for the specific language governing permissions and ~ limitations under the License. --> <html> <h2>Word Count Pipeline</h2> <p> <b>Kata:</b> Create a pipeline that counts the number of occurrences of each word. </p> <p> Please output the count of each word in the following format: </p> <pre> word:count ball:5 book:3 </pre> <br> <div class="hint"> Refer to the previous katas. </div> </html> " /> |
| <option name="feedbackLink"> |
| <FeedbackLink> |
| <option name="link" /> |
| <option name="type" value="STEPIK" /> |
| </FeedbackLink> |
| </option> |
| <option name="id" value="755604" /> |
| <option name="index" value="1" /> |
| <option name="name" value="Word Count" /> |
| <option name="record" value="-1" /> |
| <option name="status" value="Unchecked" /> |
| <option name="stepikChangeStatus" value="Up to date" /> |
| <option name="files"> |
| <map> |
| <entry key="task.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list> |
| <AnswerPlaceholder> |
| <option name="hints"> |
| <list /> |
| </option> |
| <option name="index" value="0" /> |
| <option name="initialState" /> |
| <option name="initializedFromDependency" value="false" /> |
| <option name="length" value="6" /> |
| <option name="offset" value="1021" /> |
| <option name="placeholderDependency" /> |
| <option name="placeholderText" value="TODO()" /> |
| <option name="possibleAnswer" value="beam.FlatMap(lambda sentence: sentence.split()) | beam.combiners.Count.PerElement() | beam.Map(lambda (k, v): k + ":" + str(v))" /> |
| <option name="selected" value="false" /> |
| <option name="status" value="Unchecked" /> |
| <option name="studentAnswer" /> |
| <option name="useLength" value="false" /> |
| </AnswerPlaceholder> |
| </list> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="task.py" /> |
| <option name="text" value="# TODO: type solution here " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="true" /> |
| </TaskFile> |
| </value> |
| </entry> |
| <entry key="tests.py"> |
| <value> |
| <TaskFile> |
| <option name="answerPlaceholders"> |
| <list /> |
| </option> |
| <option name="highlightErrors" value="true" /> |
| <option name="name" value="tests.py" /> |
| <option name="text" value="from test_helper import run_common_tests, failed, passed, get_answer_placeholders def test_answer_placeholders(): placeholders = get_answer_placeholders() placeholder = placeholders[0] if placeholder == "": # TODO: your condition here passed() else: failed() if __name__ == '__main__': run_common_tests() # test_answer_placeholders() # TODO: uncomment test call " /> |
| <option name="trackChanges" value="true" /> |
| <option name="trackLengths" value="true" /> |
| <option name="visible" value="false" /> |
| </TaskFile> |
| </value> |
| </entry> |
| </map> |
| </option> |
| <option name="updateDate" value="1560938273811" /> |
| </EduTask> |
| </list> |
| </option> |
| </Lesson> |
| </list> |
| </option> |
| </Section> |
| </list> |
| </option> |
| </EduCourse> |
| </option> |
| </StudyTaskManager> |
| </component> |
| </project> |