| <?xml version="1.0" encoding="UTF-8"?> |
| <!-- |
| Licensed to the Apache Software Foundation (ASF) under one or more |
| contributor license agreements. See the NOTICE file distributed with |
| this work for additional information regarding copyright ownership. |
| The ASF licenses this file to You under the Apache License, Version 2.0 |
| (the "License"); you may not use this file except in compliance with |
| the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| --> |
| <!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "http://forrest.apache.org/dtd/document-v20.dtd"> |
| <document> |
| <header> |
| <title>Administration</title> |
| </header> |
| <body> |
| |
| <!-- ========================================================== --> |
| |
| <!-- SET UP OUTPUT LOCATION STRICT CHECK --> |
| <section> |
| <title>Output location strict check</title> |
| <p>Pig scripts could contain multiple STORE statements. There are cases when one would like to avoid writing to the same output location. Pig provides admins/script writers with a property to check if multiple STORE statements make an attempt to write to the same output directory. And fail fast letting the user know of the same.</p> |
| <p>Specifically this makes sense for file-based output locations (HDFS, Local FS, S3..) to avoid Pig script from failing when multiple MR jobs write to the same location. </p> |
| <p>To enforce strict checking of output location, set <strong>pig.location.check.strict=true</strong>. See also <a href="start.html#properties">Pig Properties</a> on how to set this property.</p> |
| </section> |
| |
| <!-- DISABLE PIG COMMANDS AND OPERATORS --> |
| <section> |
| <title>Disabling Pig commands and operators</title> |
| <p>This is an admin feature providing ability to blacklist or/and whitelist certain commands and operations. Pig exposes a few of these that could be not very safe in a multitenant environment. For example, "sh" invokes shell commands, "set" allows users to change non-final configs. While these are tremendously useful in general, having an ability to disable would make Pig a safer platform. The goal is to allow administrators to be able to have more control over user scripts. Default behaviour would still be the same - no filters applied on commands and operators.</p> |
| <p>There are two properties you can use to control what users are able to do</p> |
| <ul> |
| <li>pig.blacklist</li> |
| <li>pig.whitelist</li> |
| </ul> |
| <section> |
| <title>Blacklisting</title> |
| <p>Set "pig.blacklist" to a comma-delimited set of operators and commands. For eg, <strong>pig.blacklist=rm,kill,cross</strong> would disable users from executing any of "rm", "kill" commands and "cross" operator.</p> |
| </section> |
| <section> |
| <title>Whitelisting</title> |
| <p>This is an even safer approach to disallowing functionality in Pig. Using this you will be able to disable all commands and operators that are not a part of the whitelist. For eg, <strong>pig.whitelist=load,filter,store</strong> will disallow every command and operator other than "load", "filter" and "store".</p> |
| </section> |
| <section> |
| <title>Note</title> |
| <p>There should not be any conflicts between blacklist and whitelist. Make sure to have them entirely distinct or Pig will complain.</p> |
| </section> |
| </section> |
| |
| </body> |
| </document> |