JobConf (Apache Hadoop Main 2.7.0 API)
Class JobConf
java.lang.Object
org.apache.hadoop.conf.Configuration
org.apache.hadoop.mapred.JobConf
All Implemented Interfaces:
Iterable<Map.Entry<String,String>>, Writable
@InterfaceAudience.Public
@InterfaceStability.Stable
public class JobConf
extends Configuration
A map/reduce job configuration.
JobConf is the primary interface for a user to describe a
map-reduce job to the Hadoop framework for execution. The framework tries to
faithfully execute the job as described by JobConf, however:
- Some configuration parameters might have been marked as final
by administrators and hence cannot be altered.
- While some job parameters are straightforward to set
(e.g. setNumReduceTasks(int)), some parameters interact subtly
with the rest of the framework and/or the job configuration and are relatively
more complex for the user to control finely
(e.g. setNumMapTasks(int)).
JobConf typically specifies the Mapper, combiner
(if any), Partitioner, Reducer, InputFormat and
OutputFormat implementations to be used etc.
Optionally JobConf is used to specify other advanced facets
of the job such as the Comparators to be used, files to be put in
the DistributedCache, whether or not intermediate and/or job outputs
are to be compressed (and how), and debuggability via user-provided scripts
for doing post-processing on task logs, the task's stdout, stderr, and syslog.
Here is an example of how to configure a job via JobConf:
// Create a new JobConf
JobConf job = new JobConf(new Configuration(), MyJob.class);
// Specify various job-specific parameters
job.setJobName("myjob");
FileInputFormat.setInputPaths(job, new Path("in"));
FileOutputFormat.setOutputPath(job, new Path("out"));
job.setMapperClass(MyJob.MyMapper.class);
job.setCombinerClass(MyJob.MyReducer.class);
job.setReducerClass(MyJob.MyReducer.class);
job.setInputFormat(SequenceFileInputFormat.class);
job.setOutputFormat(SequenceFileOutputFormat.class);
See Also:
JobClient, ClusterStatus, Tool, DistributedCache
Field Summary
Modifier and Type / Field and Description
static org.apache.log4j.Level DEFAULT_LOG_LEVEL
Default logging level for map/reduce tasks.
static String DEFAULT_MAPRED_TASK_JAVA_OPTS
static boolean DEFAULT_MAPREDUCE_RECOVER_JOB
Deprecated.
static String DEFAULT_QUEUE_NAME
Name of the queue to which jobs will be submitted, if no queue
name is mentioned.
static long DISABLED_MEMORY_LIMIT
Deprecated.
static String MAPRED_JOB_MAP_MEMORY_MB_PROPERTY
Deprecated.
static String MAPRED_JOB_REDUCE_MEMORY_MB_PROPERTY
Deprecated.
static String MAPRED_LOCAL_DIR_PROPERTY
Property name for the configuration property mapreduce.cluster.local.dir
static String MAPRED_MAP_TASK_ENV
Configuration key to set the environment of the child map tasks.
static String MAPRED_MAP_TASK_JAVA_OPTS
Configuration key to set the java command line options for the map tasks.
static String MAPRED_MAP_TASK_LOG_LEVEL
Configuration key to set the logging Level for the map task.
static String MAPRED_MAP_TASK_ULIMIT
Deprecated. Configuration key to set the maximum virtual memory available to the
map tasks (in kilobytes). This has been deprecated and will no
longer have any effect.
static String MAPRED_REDUCE_TASK_ENV
Configuration key to set the environment of the child reduce tasks.
static String MAPRED_REDUCE_TASK_JAVA_OPTS
Configuration key to set the java command line options for the reduce tasks.
static String MAPRED_REDUCE_TASK_LOG_LEVEL
Configuration key to set the logging Level for the reduce task.
static String MAPRED_REDUCE_TASK_ULIMIT
Deprecated. Configuration key to set the maximum virtual memory available to the
reduce tasks (in kilobytes). This has been deprecated and will no
longer have any effect.
static String MAPRED_TASK_DEFAULT_MAXVMEM_PROPERTY
Deprecated.
static String MAPRED_TASK_ENV
Deprecated.
static String MAPRED_TASK_JAVA_OPTS
Deprecated.
static String MAPRED_TASK_MAXPMEM_PROPERTY
Deprecated.
static String MAPRED_TASK_MAXVMEM_PROPERTY
Deprecated.
static String MAPRED_TASK_ULIMIT
Deprecated. Configuration key to set the maximum virtual memory available to the child
map and reduce tasks (in kilobytes). This has been deprecated and will no
longer have any effect.
static String MAPREDUCE_RECOVER_JOB
Deprecated.
static Pattern UNPACK_JAR_PATTERN_DEFAULT
Pattern for the default unpacking behavior for job jars.
static String UPPER_LIMIT_ON_TASK_VMEM_PROPERTY
Deprecated.
static String WORKFLOW_ADJACENCY_PREFIX_PATTERN
Deprecated.
static String WORKFLOW_ADJACENCY_PREFIX_STRING
Deprecated.
static String WORKFLOW_ID
Deprecated.
static String WORKFLOW_NAME
Deprecated.
static String WORKFLOW_NODE_NAME
Deprecated.
static String WORKFLOW_TAGS
Deprecated.
Constructor Summary
Constructors
Constructor and Description
JobConf()
Construct a map/reduce job configuration.
JobConf(boolean loadDefaults)
A new map/reduce configuration where the behavior of reading from the
default resources can be turned off.
JobConf(Class exampleClass)
Construct a map/reduce job configuration.
JobConf(Configuration conf)
Construct a map/reduce job configuration.
JobConf(Configuration conf, Class exampleClass)
Construct a map/reduce job configuration.
JobConf(Path config)
Construct a map/reduce configuration.
JobConf(String config)
Construct a map/reduce configuration.
Method Summary
Modifier and Type / Method and Description
void deleteLocalFiles()
Deprecated.
void deleteLocalFiles(String subdir)
static String findContainingJar(Class my_class)
Find a jar that contains a class of the same name, if any.
Class<? extends Reducer> getCombinerClass()
Get the user-defined combiner class used to combine map-outputs before being sent to the reducers.
RawComparator getCombinerKeyGroupingComparator()
Get the user-defined comparator for grouping keys of inputs to the combiner.
boolean getCompressMapOutput()
Are the outputs of the maps to be compressed?
org.apache.hadoop.security.Credentials getCredentials()
Get credentials for the job.
InputFormat getInputFormat()
Get the InputFormat implementation for the map-reduce job, defaults to TextInputFormat if not specified explicitly.
String getJar()
Get the user jar for the map-reduce job.
Pattern getJarUnpackPattern()
Get the pattern for jar contents to unpack on the tasktracker.
String getJobEndNotificationURI()
Get the URI to be invoked in order to send a notification after the job has completed (success/failure).
String getJobLocalDir()
Get the job-specific shared directory for use as scratch space.
String getJobName()
Get the user-specified job name.
JobPriority getJobPriority()
Get the JobPriority for this job.
boolean getKeepFailedTaskFiles()
Should the temporary files for failed tasks be kept?
String getKeepTaskFilesPattern()
Get the regular expression that is matched against the task names to see if we need to keep the files.
String[] getLocalDirs()
Path getLocalPath(String pathString)
Constructs a local file name.
String getMapDebugScript()
Get the map task's debug script.
Class<? extends CompressionCodec> getMapOutputCompressorClass(Class<? extends CompressionCodec> defaultValue)
Get the CompressionCodec for compressing the map outputs.
Class<?> getMapOutputKeyClass()
Get the key class for the map output data.
Class<?> getMapOutputValueClass()
Get the value class for the map output data.
Class<? extends Mapper> getMapperClass()
Get the Mapper class for the job.
Class<? extends MapRunnable> getMapRunnerClass()
Get the MapRunnable class for the job.
boolean getMapSpeculativeExecution()
Should speculative execution be used for this job for map tasks? Defaults to true.
int getMaxMapAttempts()
Get the configured number of maximum attempts that will be made to run a map task, as specified by the mapreduce.map.maxattempts property.
int getMaxMapTaskFailuresPercent()
Get the maximum percentage of map tasks that can fail without the job being aborted.
long getMaxPhysicalMemoryForTask()
Deprecated. This variable is deprecated and no longer in use.
int getMaxReduceAttempts()
Get the configured number of maximum attempts that will be made to run a reduce task, as specified by the mapreduce.reduce.maxattempts property.
int getMaxReduceTaskFailuresPercent()
Get the maximum percentage of reduce tasks that can fail without the job being aborted.
int getMaxTaskFailuresPerTracker()
Expert: Get the maximum no. of failures of a given job per tasktracker.
long getMaxVirtualMemoryForTask()
Deprecated.
long getMemoryForMapTask()
Get memory required to run a map task of the job, in MB.
long getMemoryForReduceTask()
Get memory required to run a reduce task of the job, in MB.
int getNumMapTasks()
Get the configured number of map tasks for this job.
int getNumReduceTasks()
Get the configured number of reduce tasks for this job.
int getNumTasksToExecutePerJvm()
Get the number of tasks that a spawned JVM should execute.
OutputCommitter getOutputCommitter()
Get the OutputCommitter implementation for the map-reduce job, defaults to FileOutputCommitter if not specified explicitly.
OutputFormat getOutputFormat()
Get the OutputFormat implementation for the map-reduce job, defaults to TextOutputFormat if not specified explicitly.
Class<?> getOutputKeyClass()
Get the key class for the job output data.
RawComparator getOutputKeyComparator()
Get the RawComparator comparator used to compare keys.
Class<?> getOutputValueClass()
Get the value class for job outputs.
RawComparator getOutputValueGroupingComparator()
Get the user-defined comparator for grouping keys of inputs to the reduce.
Class<? extends Partitioner> getPartitionerClass()
Get the Partitioner used to partition Mapper-outputs to be sent to the Reducers.
boolean getProfileEnabled()
Get whether the task profiling is enabled.
String getProfileParams()
Get the profiler configuration arguments.
org.apache.hadoop.conf.Configuration.IntegerRanges getProfileTaskRange(boolean isMap)
Get the range of maps or reduces to profile.
String getQueueName()
Return the name of the queue to which this job is submitted.
String getReduceDebugScript()
Get the reduce task's debug script.
Class<? extends Reducer> getReducerClass()
Get the Reducer class for the job.
boolean getReduceSpeculativeExecution()
Should speculative execution be used for this job for reduce tasks? Defaults to true.
String getSessionId()
Deprecated.
boolean getSpeculativeExecution()
Should speculative execution be used for this job? Defaults to true.
boolean getUseNewMapper()
Should the framework use the new context-object code for running the mapper?
boolean getUseNewReducer()
Should the framework use the new context-object code for running the reducer?
String getUser()
Get the reported username for this job.
Path getWorkingDirectory()
Get the current working directory for the default file system.
static long normalizeMemoryConfigValue(long val)
Normalize the negative values in configuration.
void setCombinerClass(Class<? extends Reducer> theClass)
Set the user-defined combiner class used to combine map-outputs before being sent to the reducers.
void setCombinerKeyGroupingComparator(Class<? extends RawComparator> theClass)
Set the user-defined comparator for grouping keys in the input to the combiner.
void setCompressMapOutput(boolean compress)
Should the map outputs be compressed before transfer?
void setInputFormat(Class<? extends InputFormat> theClass)
Set the InputFormat implementation for the map-reduce job.
void setJar(String jar)
Set the user jar for the map-reduce job.
void setJarByClass(Class cls)
Set the job's jar file by finding an example class location.
void setJobEndNotificationURI(String uri)
Set the URI to be invoked in order to send a notification after the job has completed (success/failure).
void setJobName(String name)
Set the user-specified job name.
void setJobPriority(JobPriority prio)
Set the JobPriority for this job.
void setKeepFailedTaskFiles(boolean keep)
Set whether the framework should keep the intermediate files for failed tasks.
void setKeepTaskFilesPattern(String pattern)
Set a regular expression for task names that should be kept.
void setKeyFieldComparatorOptions(String keySpec)
Set the KeyFieldBasedComparator options used to compare keys.
void setKeyFieldPartitionerOptions(String keySpec)
Set the KeyFieldBasedPartitioner options used for partitioning.
void setMapDebugScript(String mDbgScript)
Set the debug script to be run when the map tasks fail.
void setMapOutputCompressorClass(Class<? extends CompressionCodec> codecClass)
Set the given class as the CompressionCodec for the map outputs.
void setMapOutputKeyClass(Class<?> theClass)
Set the key class for the map output data.
void setMapOutputValueClass(Class<?> theClass)
Set the value class for the map output data.
void setMapperClass(Class<? extends Mapper> theClass)
Set the Mapper class for the job.
void setMapRunnerClass(Class<? extends MapRunnable> theClass)
Expert: Set the MapRunnable class for the job.
void setMapSpeculativeExecution(boolean speculativeExecution)
Turn speculative execution on or off for this job for map tasks.
void setMaxMapAttempts(int n)
Expert: Set the number of maximum attempts that will be made to run a map task.
void setMaxMapTaskFailuresPercent(int percent)
Expert: Set the maximum percentage of map tasks that can fail without the job being aborted.
void setMaxPhysicalMemoryForTask(long mem)
Deprecated.
void setMaxReduceAttempts(int n)
Expert: Set the number of maximum attempts that will be made to run a reduce task.
void setMaxReduceTaskFailuresPercent(int percent)
Set the maximum percentage of reduce tasks that can fail without the job being aborted.
void setMaxTaskFailuresPerTracker(int noFailures)
Set the maximum no. of failures of a given job per tasktracker.
void setMaxVirtualMemoryForTask(long vmem)
Deprecated.
void setMemoryForMapTask(long mem)
void setMemoryForReduceTask(long mem)
void setNumMapTasks(int n)
Set the number of map tasks for this job.
void setNumReduceTasks(int n)
Set the requisite number of reduce tasks for this job.
void setNumTasksToExecutePerJvm(int numTasks)
Sets the number of tasks that a spawned task JVM should run before it exits.
void setOutputCommitter(Class<? extends OutputCommitter> theClass)
Set the OutputCommitter implementation for the map-reduce job.
void setOutputFormat(Class<? extends OutputFormat> theClass)
Set the OutputFormat implementation for the map-reduce job.
void setOutputKeyClass(Class<?> theClass)
Set the key class for the job output data.
void setOutputKeyComparatorClass(Class<? extends RawComparator> theClass)
Set the RawComparator comparator used to compare keys.
void setOutputValueClass(Class<?> theClass)
Set the value class for job outputs.
void setOutputValueGroupingComparator(Class<? extends RawComparator> theClass)
Set the user-defined comparator for grouping keys in the input to the reduce.
void setPartitionerClass(Class<? extends Partitioner> theClass)
Set the Partitioner class used to partition Mapper-outputs to be sent to the Reducers.
void setProfileEnabled(boolean newValue)
Set whether the system should collect profiler information for some of the tasks in this job.
void setProfileParams(String value)
Set the profiler configuration arguments.
void setProfileTaskRange(boolean isMap, String newValue)
Set the ranges of maps or reduces to profile.
void setQueueName(String queueName)
Set the name of the queue to which this job should be submitted.
void setReduceDebugScript(String rDbgScript)
Set the debug script to be run when the reduce tasks fail.
void setReducerClass(Class<? extends Reducer> theClass)
Set the Reducer class for the job.
void setReduceSpeculativeExecution(boolean speculativeExecution)
Turn speculative execution on or off for this job for reduce tasks.
void setSessionId(String sessionId)
Deprecated.
void setSpeculativeExecution(boolean speculativeExecution)
Turn speculative execution on or off for this job.
void setUseNewMapper(boolean flag)
Set whether the framework should use the new api for the mapper.
void setUseNewReducer(boolean flag)
Set whether the framework should use the new api for the reducer.
void setUser(String user)
Set the reported username for this job.
void setWorkingDirectory(Path dir)
Set the current working directory for the default file system.
Methods inherited from class org.apache.hadoop.conf.Configuration
Methods inherited from class java.lang.Object
Field Detail
MAPRED_TASK_MAXVMEM_PROPERTY
public static final String MAPRED_TASK_MAXVMEM_PROPERTY
Deprecated. Use MAPRED_JOB_MAP_MEMORY_MB_PROPERTY and MAPRED_JOB_REDUCE_MEMORY_MB_PROPERTY
UPPER_LIMIT_ON_TASK_VMEM_PROPERTY
public static final String UPPER_LIMIT_ON_TASK_VMEM_PROPERTY
Deprecated.
MAPRED_TASK_DEFAULT_MAXVMEM_PROPERTY
public static final String MAPRED_TASK_DEFAULT_MAXVMEM_PROPERTY
Deprecated.
MAPRED_TASK_MAXPMEM_PROPERTY
public static final String MAPRED_TASK_MAXPMEM_PROPERTY
Deprecated.
DISABLED_MEMORY_LIMIT
public static final long DISABLED_MEMORY_LIMIT
Deprecated.
A value which if set for memory related configuration options,
indicates that the options are turned off.
Deprecated because it makes no sense in the context of MR2.
MAPRED_LOCAL_DIR_PROPERTY
public static final String MAPRED_LOCAL_DIR_PROPERTY
Property name for the configuration property mapreduce.cluster.local.dir
DEFAULT_QUEUE_NAME
public static final String DEFAULT_QUEUE_NAME
Name of the queue to which jobs will be submitted, if no queue
name is mentioned.
MAPRED_JOB_MAP_MEMORY_MB_PROPERTY
public static final String MAPRED_JOB_MAP_MEMORY_MB_PROPERTY
Deprecated.
The variable is kept for M/R 1.x applications, while M/R 2.x applications
should use MRJobConfig.MAP_MEMORY_MB instead.
MAPRED_JOB_REDUCE_MEMORY_MB_PROPERTY
public static final String MAPRED_JOB_REDUCE_MEMORY_MB_PROPERTY
Deprecated.
The variable is kept for M/R 1.x applications, while M/R 2.x applications
should use MRJobConfig.REDUCE_MEMORY_MB instead.
UNPACK_JAR_PATTERN_DEFAULT
public static final Pattern UNPACK_JAR_PATTERN_DEFAULT
Pattern for the default unpacking behavior for job jars.
MAPRED_TASK_JAVA_OPTS
public static final String MAPRED_TASK_JAVA_OPTS
Deprecated. Use MAPRED_MAP_TASK_JAVA_OPTS or MAPRED_REDUCE_TASK_JAVA_OPTS
Configuration key to set the java command line options for the child
map and reduce tasks.
Java opts for the task tracker child processes.
The following symbol, if present, will be interpolated: @taskid@.
It is replaced by the current TaskID. Any other occurrences of '@' will go
unchanged.
For example, to enable verbose gc logging to a file named for the taskid in
/tmp and to set the heap maximum to be a gigabyte, pass a 'value' of:
-Xmx1024m -verbose:gc -Xloggc:/tmp/@taskid@.gc
The configuration variable MAPRED_TASK_ENV can be used to pass
other environment variables to the child processes.
MAPRED_MAP_TASK_JAVA_OPTS
public static final String MAPRED_MAP_TASK_JAVA_OPTS
Configuration key to set the java command line options for the map tasks.
Java opts for the task tracker child map processes.
The following symbol, if present, will be interpolated: @taskid@.
It is replaced by the current TaskID. Any other occurrences of '@' will go
unchanged.
For example, to enable verbose gc logging to a file named for the taskid in
/tmp and to set the heap maximum to be a gigabyte, pass a 'value' of:
-Xmx1024m -verbose:gc -Xloggc:/tmp/@taskid@.gc
The configuration variable MAPRED_MAP_TASK_ENV can be used to pass
other environment variables to the map processes.
MAPRED_REDUCE_TASK_JAVA_OPTS
public static final String MAPRED_REDUCE_TASK_JAVA_OPTS
Configuration key to set the java command line options for the reduce tasks.
Java opts for the task tracker child reduce processes.
The following symbol, if present, will be interpolated: @taskid@.
It is replaced by the current TaskID. Any other occurrences of '@' will go
unchanged.
For example, to enable verbose gc logging to a file named for the taskid in
/tmp and to set the heap maximum to be a gigabyte, pass a 'value' of:
-Xmx1024m -verbose:gc -Xloggc:/tmp/@taskid@.gc
The configuration variable MAPRED_REDUCE_TASK_ENV can be used to
pass process environment variables to the reduce processes.
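As an illustration, the per-task opts can be set through the generic Configuration setters using the constants above; a minimal sketch (MyJob is a hypothetical job class and the option values are illustrative):
// Sketch: separate JVM options for map and reduce tasks.
JobConf job = new JobConf(new Configuration(), MyJob.class);
job.set(JobConf.MAPRED_MAP_TASK_JAVA_OPTS,
        "-Xmx512m -verbose:gc -Xloggc:/tmp/@taskid@.gc");
job.set(JobConf.MAPRED_REDUCE_TASK_JAVA_OPTS, "-Xmx1024m");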
DEFAULT_MAPRED_TASK_JAVA_OPTS
public static final String DEFAULT_MAPRED_TASK_JAVA_OPTS
MAPRED_TASK_ULIMIT
public static final String MAPRED_TASK_ULIMIT
Deprecated. Configuration key to set the maximum virtual memory available to the child
map and reduce tasks (in kilobytes). This has been deprecated and will no
longer have any effect.
MAPRED_MAP_TASK_ULIMIT
public static final String MAPRED_MAP_TASK_ULIMIT
Deprecated. Configuration key to set the maximum virtual memory available to the
map tasks (in kilobytes). This has been deprecated and will no
longer have any effect.
MAPRED_REDUCE_TASK_ULIMIT
public static final String MAPRED_REDUCE_TASK_ULIMIT
Deprecated. Configuration key to set the maximum virtual memory available to the
reduce tasks (in kilobytes). This has been deprecated and will no
longer have any effect.
MAPRED_TASK_ENV
public static final String MAPRED_TASK_ENV
Deprecated. Use MAPRED_MAP_TASK_ENV or MAPRED_REDUCE_TASK_ENV
Configuration key to set the environment of the child map/reduce tasks.
The format of the value is k1=v1,k2=v2. Further, it can
reference existing environment variables via $key on
Linux or %key% on Windows.
A=foo - This will set the env variable A to foo.
B=$X:c - This will inherit the tasktracker's X env variable on Linux.
B=%X%;c - This will inherit the tasktracker's X env variable on Windows.
MAPRED_MAP_TASK_ENV
public static final String MAPRED_MAP_TASK_ENV
Configuration key to set the environment of the child map tasks.
The format of the value is k1=v1,k2=v2. Further, it can
reference existing environment variables via $key on
Linux or %key% on Windows.
A=foo - This will set the env variable A to foo.
B=$X:c - This will inherit the tasktracker's X env variable on Linux.
B=%X%;c - This will inherit the tasktracker's X env variable on Windows.
MAPRED_REDUCE_TASK_ENV
public static final String MAPRED_REDUCE_TASK_ENV
Configuration key to set the environment of the child reduce tasks.
The format of the value is k1=v1,k2=v2. Further, it can
reference existing environment variables via $key on
Linux or %key% on Windows.
A=foo - This will set the env variable A to foo.
B=$X:c - This will inherit the tasktracker's X env variable on Linux.
B=%X%;c - This will inherit the tasktracker's X env variable on Windows.
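For example, these env keys can be set with the ordinary Configuration setters; a minimal sketch with illustrative values:
// Sketch: pass environment variables to the child map and reduce tasks.
JobConf job = new JobConf();
job.set(JobConf.MAPRED_MAP_TASK_ENV, "A=foo,B=$PATH:/opt/tools");
job.set(JobConf.MAPRED_REDUCE_TASK_ENV, "A=bar");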
MAPRED_MAP_TASK_LOG_LEVEL
public static final String MAPRED_MAP_TASK_LOG_LEVEL
Configuration key to set the logging Level for the map task.
The allowed logging levels are:
OFF, FATAL, ERROR, WARN, INFO, DEBUG, TRACE and ALL.
MAPRED_REDUCE_TASK_LOG_LEVEL
public static final String MAPRED_REDUCE_TASK_LOG_LEVEL
Configuration key to set the logging Level for the reduce task.
The allowed logging levels are:
OFF, FATAL, ERROR, WARN, INFO, DEBUG, TRACE and ALL.
DEFAULT_LOG_LEVEL
public static final org.apache.log4j.Level DEFAULT_LOG_LEVEL
Default logging level for map/reduce tasks.
WORKFLOW_ID
public static final String WORKFLOW_ID
Deprecated.&
The variable is kept for M/R 1.x applications, M/R 2.x applications should
use MRJobConfig.WORKFLOW_ID instead
WORKFLOW_NAME
public static final String WORKFLOW_NAME
Deprecated.&
The variable is kept for M/R 1.x applications, M/R 2.x applications should
use MRJobConfig.WORKFLOW_NAME instead
WORKFLOW_NODE_NAME
public static final String WORKFLOW_NODE_NAME
Deprecated.&
The variable is kept for M/R 1.x applications, M/R 2.x applications should
use MRJobConfig.WORKFLOW_NODE_NAME instead
WORKFLOW_ADJACENCY_PREFIX_STRING
public static final String WORKFLOW_ADJACENCY_PREFIX_STRING
Deprecated.&
The variable is kept for M/R 1.x applications, M/R 2.x applications should
use MRJobConfig.WORKFLOW_ADJACENCY_PREFIX_STRING instead
WORKFLOW_ADJACENCY_PREFIX_PATTERN
public static final String WORKFLOW_ADJACENCY_PREFIX_PATTERN
Deprecated.&
The variable is kept for M/R 1.x applications, M/R 2.x applications should
use MRJobConfig.WORKFLOW_ADJACENCY_PREFIX_PATTERN instead
WORKFLOW_TAGS
public static final String WORKFLOW_TAGS
Deprecated.&
The variable is kept for M/R 1.x applications, M/R 2.x applications should
use MRJobConfig.WORKFLOW_TAGS instead
MAPREDUCE_RECOVER_JOB
public static final String MAPREDUCE_RECOVER_JOB
Deprecated.&
The variable is kept for M/R 1.x applications, M/R 2.x applications should
not use it
DEFAULT_MAPREDUCE_RECOVER_JOB
public static final boolean DEFAULT_MAPREDUCE_RECOVER_JOB
Deprecated.&
The variable is kept for M/R 1.x applications, M/R 2.x applications should
not use it
Constructor Detail
JobConf
public JobConf()
Construct a map/reduce job configuration.
JobConf
public JobConf(Class exampleClass)
Construct a map/reduce job configuration.
Parameters: exampleClass - a class whose containing jar is used as the job's jar.
JobConf
public JobConf(Configuration conf)
Construct a map/reduce job configuration.
Parameters: conf - a Configuration whose settings will be inherited.
JobConf
public JobConf(Configuration conf, Class exampleClass)
Construct a map/reduce job configuration.
Parameters: conf - a Configuration whose settings will be inherited.
exampleClass - a class whose containing jar is used as the job's jar.
JobConf
public JobConf(String config)
Construct a map/reduce configuration.
Parameters: config - a Configuration-format XML job description file.
JobConf
public JobConf(Path config)
Construct a map/reduce configuration.
Parameters: config - a Configuration-format XML job description file.
JobConf
public JobConf(boolean loadDefaults)
A new map/reduce configuration where the behavior of reading from the
default resources can be turned off.
If the parameter loadDefaults is false, the new instance
will not load resources from the default files.
Parameters: loadDefaults - specifies whether to load from the default files
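A quick sketch of the constructor variants above (MyJob is a hypothetical job class):
JobConf a = new JobConf();                                  // load default resources
JobConf b = new JobConf(false);                             // skip default resources
JobConf c = new JobConf(MyJob.class);                       // infer the job jar from a class
JobConf d = new JobConf(new Configuration(), MyJob.class);  // inherit settings and infer the jar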
Method Detail
getCredentials
public org.apache.hadoop.security.Credentials getCredentials()
Get credentials for the job.
Returns:credentials for the job
getJar
public String getJar()
Get the user jar for the map-reduce job.
Returns: the user jar for the map-reduce job.
setJar
public void setJar(String jar)
Set the user jar for the map-reduce job.
Parameters: jar - the user jar for the map-reduce job.
getJarUnpackPattern
public Pattern getJarUnpackPattern()
Get the pattern for jar contents to unpack on the tasktracker.
setJarByClass
public void setJarByClass(Class cls)
Set the job's jar file by finding an example class location.
Parameters:cls - the example class.
getLocalDirs
public String[] getLocalDirs() throws IOException
deleteLocalFiles
public void deleteLocalFiles() throws IOException
Deprecated.
Use MRAsyncDiskService.moveAndDeleteAllVolumes instead.
deleteLocalFiles
public void deleteLocalFiles(String subdir) throws IOException
getLocalPath
public Path getLocalPath(String pathString) throws IOException
Constructs a local file name. Files are distributed among configured
local directories.
getUser
public String getUser()
Get the reported username for this job.
Returns: the username
setUser
public void setUser(String user)
Set the reported username for this job.
Parameters: user - the username for this job.
setKeepFailedTaskFiles
public void setKeepFailedTaskFiles(boolean keep)
Set whether the framework should keep the intermediate files for
failed tasks.
Parameters:keep - true if framework should keep the intermediate files
for failed tasks, false otherwise.
getKeepFailedTaskFiles
public boolean getKeepFailedTaskFiles()
Should the temporary files for failed tasks be kept?
Returns:should the files be kept?
setKeepTaskFilesPattern
public void setKeepTaskFilesPattern(String pattern)
Set a regular expression for task names that should be kept.
The regular expression ".*_m_000123_0" would keep the files
for the first instance of map 123 that ran.
Parameters:pattern - the java.util.regex.Pattern to match against the
task names.
getKeepTaskFilesPattern
public String getKeepTaskFilesPattern()
Get the regular expression that is matched against the task names
to see if we need to keep the files.
Returns: the pattern as a string, if it was set, otherwise null.
setWorkingDirectory
public void setWorkingDirectory(Path dir)
Set the current working directory for the default file system.
Parameters:dir - the new current working directory.
getWorkingDirectory
public Path getWorkingDirectory()
Get the current working directory for the default file system.
Returns:the directory name.
setNumTasksToExecutePerJvm
public void setNumTasksToExecutePerJvm(int numTasks)
Sets the number of tasks that a spawned task JVM should run
before it exits.
Parameters:numTasks - the number defaults to 1;
-1 signifies no limit
getNumTasksToExecutePerJvm
public int getNumTasksToExecutePerJvm()
Get the number of tasks that a spawned JVM should execute.
getInputFormat
public InputFormat getInputFormat()
Get the InputFormat implementation for the map-reduce job,
defaults to TextInputFormat if not specified explicitly.
Returns: the InputFormat implementation for the map-reduce job.
setInputFormat
public void setInputFormat(Class<? extends InputFormat> theClass)
Set the InputFormat implementation for the map-reduce job.
Parameters: theClass - the InputFormat implementation for the map-reduce job.
getOutputFormat
public OutputFormat getOutputFormat()
Get the OutputFormat implementation for the map-reduce job,
defaults to TextOutputFormat if not specified explicitly.
Returns: the OutputFormat implementation for the map-reduce job.
getOutputCommitter
public OutputCommitter getOutputCommitter()
Get the OutputCommitter implementation for the map-reduce job,
defaults to FileOutputCommitter if not specified explicitly.
Returns: the OutputCommitter implementation for the map-reduce job.
setOutputCommitter
public void setOutputCommitter(Class<? extends OutputCommitter> theClass)
Set the OutputCommitter implementation for the map-reduce job.
Parameters: theClass - the OutputCommitter implementation for the map-reduce job.
setOutputFormat
public void setOutputFormat(Class<? extends OutputFormat> theClass)
Set the OutputFormat implementation for the map-reduce job.
Parameters: theClass - the OutputFormat implementation for the map-reduce job.
setCompressMapOutput
public void setCompressMapOutput(boolean compress)
Should the map outputs be compressed before transfer?
Parameters: compress - should the map outputs be compressed?
getCompressMapOutput
public boolean getCompressMapOutput()
Are the outputs of the maps to be compressed?
Returns: true if the outputs of the maps are to be compressed,
false otherwise.
setMapOutputCompressorClass
public void setMapOutputCompressorClass(Class<? extends CompressionCodec> codecClass)
Set the given class as the CompressionCodec for the map outputs.
Parameters: codecClass - the CompressionCodec class that will compress
the map outputs.
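Together, these two methods enable intermediate-output compression; a minimal sketch using the GzipCodec that ships with Hadoop:
// Sketch: compress the map outputs before they are shuffled to the reducers.
JobConf job = new JobConf();
job.setCompressMapOutput(true);
job.setMapOutputCompressorClass(org.apache.hadoop.io.compress.GzipCodec.class);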
getMapOutputCompressorClass
public Class<? extends CompressionCodec> getMapOutputCompressorClass(Class<? extends CompressionCodec> defaultValue)
Get the CompressionCodec for compressing the map outputs.
Parameters: defaultValue - the CompressionCodec to return if not set
Returns: the CompressionCodec class that should be used to compress the
map outputs.
Throws: IllegalArgumentException - if the class was specified, but not found
getMapOutputKeyClass
public Class<?> getMapOutputKeyClass()
Get the key class for the map output data. If it is not set, use the
(final) output key class. This allows the map output key class to be
different than the final output key class.
Returns: the map output key class.
setMapOutputKeyClass
public void setMapOutputKeyClass(Class<?> theClass)
Set the key class for the map output data. This allows the user to
specify the map output key class to be different than the final output
key class.
Parameters: theClass - the map output key class.
getMapOutputValueClass
public Class<?> getMapOutputValueClass()
Get the value class for the map output data. If it is not set, use the
(final) output value class. This allows the map output value class to be
different than the final output value class.
Returns: the map output value class.
setMapOutputValueClass
public void setMapOutputValueClass(Class<?> theClass)
Set the value class for the map output data. This allows the user to
specify the map output value class to be different than the final output
value class.
Parameters: theClass - the map output value class.
getOutputKeyClass
public Class<?> getOutputKeyClass()
Get the key class for the job output data.
Returns: the key class for the job output data.
setOutputKeyClass
public void setOutputKeyClass(Class<?> theClass)
Set the key class for the job output data.
Parameters: theClass - the key class for the job output data.
getOutputKeyComparator
public RawComparator getOutputKeyComparator()
Get the RawComparator comparator used to compare keys.
Returns: the RawComparator comparator used to compare keys.
setOutputKeyComparatorClass
public void setOutputKeyComparatorClass(Class<? extends RawComparator> theClass)
Set the RawComparator comparator used to compare keys.
Parameters: theClass - the RawComparator comparator used to compare keys.
See Also: setOutputValueGroupingComparator(Class)
setKeyFieldComparatorOptions
public void setKeyFieldComparatorOptions(String keySpec)
Set the KeyFieldBasedComparator options used to compare keys.
Parameters: keySpec - the key specification of the form -k pos1[,pos2], where
pos is of the form f[.c][opts], where f is the number
of the key field to use, and c is the number of the first character from
the beginning of the field. Fields and character positions are numbered
starting with 1; a character position of zero in pos2 indicates the
field's last character. If '.c' is omitted from pos1, it defaults to 1
(the beginning of the field); if omitted from pos2, it defaults to 0
(the end of the field). opts are ordering options. The supported options are:
-n (sort numerically)
-r (reverse the result of comparison)
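For instance, to sort on the second key field, numerically and in descending order, a job could pass the spec below; a sketch, assuming the keys actually contain such a field (this call also selects KeyFieldBasedComparator as the key comparator):
// Sketch: compare keys on field 2 only, numeric, reversed.
JobConf job = new JobConf();
job.setKeyFieldComparatorOptions("-k2,2nr");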
getKeyFieldComparatorOption
public String getKeyFieldComparatorOption()
Get the KeyFieldBasedComparator options.
setKeyFieldPartitionerOptions
public void setKeyFieldPartitionerOptions(String keySpec)
Set the KeyFieldBasedPartitioner options used for partitioning.
Parameters: keySpec - the key specification of the form -k pos1[,pos2], where
pos is of the form f[.c][opts], where f is the number
of the key field to use, and c is the number of the first character from
the beginning of the field. Fields and character positions are numbered
starting with 1; a character position of zero in pos2 indicates the
field's last character. If '.c' is omitted from pos1, it defaults to 1
(the beginning of the field); if omitted from pos2, it defaults to 0
(the end of the field).
getKeyFieldPartitionerOption
public String getKeyFieldPartitionerOption()
Get the KeyFieldBasedPartitioner options.
getCombinerKeyGroupingComparator
public RawComparator getCombinerKeyGroupingComparator()
Get the user-defined comparator for grouping keys of inputs to the combiner.
Returns: comparator set by the user for grouping values.
See Also: setCombinerKeyGroupingComparator(Class)
getOutputValueGroupingComparator
public RawComparator getOutputValueGroupingComparator()
Get the user-defined comparator for grouping keys of inputs to the reduce.
Returns: comparator set by the user for grouping values.
See Also: setOutputValueGroupingComparator(Class)
setCombinerKeyGroupingComparator
public void setCombinerKeyGroupingComparator(Class<? extends RawComparator> theClass)
Set the user-defined comparator for grouping keys in the input to the combiner.
This comparator should be provided if the equivalence rules for keys
for sorting the intermediates are different from those for grouping keys
before each call to the combiner.
For key-value pairs (K1,V1) and (K2,V2), the values (V1, V2) are passed
in a single call to the reduce function if K1 and K2 compare as equal.
setOutputKeyComparatorClass(Class) can be used to control
how keys are sorted, and this can be used in conjunction to simulate
secondary sort on values.
Note: This is not a guarantee of the combiner sort being
stable in any sense. (In any case, with the order of available
map-outputs to the combiner being non-deterministic, it wouldn't make
that much sense.)
Parameters: theClass - the comparator class to be used for grouping keys for the
combiner. It should implement RawComparator.
See Also: setOutputKeyComparatorClass(Class)
setOutputValueGroupingComparator
public void setOutputValueGroupingComparator(Class<? extends RawComparator> theClass)
Set the user-defined comparator for grouping keys in the input to the reduce.
This comparator should be provided if the equivalence rules for keys
for sorting the intermediates are different from those for grouping keys
before each call to the reducer.
For key-value pairs (K1,V1) and (K2,V2), the values (V1, V2) are passed
in a single call to the reduce function if K1 and K2 compare as equal.
setOutputKeyComparatorClass(Class) can be used to control
how keys are sorted, and this can be used in conjunction to simulate
secondary sort on values.
Note: This is not a guarantee of the reduce sort being
stable in any sense. (In any case, with the order of available
map-outputs to the reduce being non-deterministic, it wouldn't make
that much sense.)
Parameters: theClass - the comparator class to be used for grouping keys.
It should implement RawComparator.
See Also: setOutputKeyComparatorClass(Class), setCombinerKeyGroupingComparator(Class)
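A common application is secondary sort: order the intermediate data on a composite key but group the reduce input on only part of it. A sketch, where FullKeyComparator and FirstFieldComparator are hypothetical RawComparator implementations:
// Sketch: values arrive at reduce() in composite-key order, while
// grouping of reduce() calls considers only the first key field.
// FullKeyComparator and FirstFieldComparator are hypothetical classes.
JobConf job = new JobConf();
job.setOutputKeyComparatorClass(FullKeyComparator.class);
job.setOutputValueGroupingComparator(FirstFieldComparator.class);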
getUseNewMapper
public boolean getUseNewMapper()
Should the framework use the new context-object code for running
the mapper?
Returns:true, if the new api should be used
setUseNewMapper
public void setUseNewMapper(boolean flag)
Set whether the framework should use the new api for the mapper.
This is the default for jobs submitted with the new Job api.
Parameters:flag - true, if the new api should be used
getUseNewReducer
public boolean getUseNewReducer()
Should the framework use the new context-object code for running
the reducer?
Returns:true, if the new api should be used
setUseNewReducer
public void setUseNewReducer(boolean flag)
Set whether the framework should use the new api for the reducer.
This is the default for jobs submitted with the new Job api.
Parameters:flag - true, if the new api should be used
getOutputValueClass
public Class<?> getOutputValueClass()
Get the value class for job outputs.
Returns:the value class for job outputs.
setOutputValueClass
public void setOutputValueClass(Class<?> theClass)
Set the value class for job outputs.
Parameters:theClass - the value class for job outputs.
getMapperClass
public Class<? extends Mapper> getMapperClass()
Get the Mapper class for the job.
Returns: the Mapper class for the job.
setMapperClass
public void setMapperClass(Class<? extends Mapper> theClass)
Set the Mapper class for the job.
Parameters: theClass - the Mapper class for the job.
getMapRunnerClass
public Class<? extends MapRunnable> getMapRunnerClass()
Get the MapRunnable class for the job.
Returns: the MapRunnable class for the job.
setMapRunnerClass
public void setMapRunnerClass(Class<? extends MapRunnable> theClass)
Expert: Set the MapRunnable class for the job.
Typically used to exert greater control on Mappers.
Parameters: theClass - the MapRunnable class for the job.
getPartitionerClass
public Class<? extends Partitioner> getPartitionerClass()
Get the Partitioner used to partition Mapper-outputs
to be sent to the Reducers.
Returns: the Partitioner used to partition map-outputs.
setPartitionerClass
public void setPartitionerClass(Class<? extends Partitioner> theClass)
Set the Partitioner class used to partition
Mapper-outputs to be sent to the Reducers.
Parameters: theClass - the Partitioner used to partition map-outputs.
getReducerClass
public Class<? extends Reducer> getReducerClass()
Get the Reducer class for the job.
Returns: the Reducer class for the job.
setReducerClass
public void setReducerClass(Class<? extends Reducer> theClass)
Set the Reducer class for the job.
Parameters: theClass - the Reducer class for the job.
getCombinerClass
public Class<? extends Reducer> getCombinerClass()
Get the user-defined combiner class used to combine map-outputs
before being sent to the reducers. Typically the combiner is the same as the
Reducer for the job, i.e. getReducerClass().
Returns: the user-defined combiner class used to combine map-outputs.
setCombinerClass
public void setCombinerClass(Class<? extends Reducer> theClass)
Set the user-defined combiner class used to combine map-outputs
before being sent to the reducers.
The combiner is an application-specified aggregation operation, which
can help cut down the amount of data transferred between the
Mapper and the Reducer, leading to better performance.
The framework may invoke the combiner 0, 1, or multiple times, in both
the mapper and reducer tasks. In general, the combiner is called as the
sort/merge result is written to disk. The combiner must:
- be side-effect free
- have the same input and output key types and the same input and
output value types
Typically the combiner is the same as the Reducer for the
job, i.e. setReducerClass(Class).
Parameters: theClass - the user-defined combiner class used to combine
map-outputs.
getSpeculativeExecution
public boolean getSpeculativeExecution()
Should speculative execution be used for this job?
Defaults to true.
Returns: true if speculative execution should be used for this job,
false otherwise.
setSpeculativeExecution
public void setSpeculativeExecution(boolean speculativeExecution)
Turn speculative execution on or off for this job.
Parameters: speculativeExecution - true if speculative execution
should be turned on, else false.
getMapSpeculativeExecution
public boolean getMapSpeculativeExecution()
Should speculative execution be used for this job for map tasks?
Defaults to true.
Returns: true if speculative execution should be used for map tasks in
this job, false otherwise.
setMapSpeculativeExecution
public void setMapSpeculativeExecution(boolean speculativeExecution)
Turn speculative execution on or off for this job for map tasks.
Parameters: speculativeExecution - true if speculative execution
should be turned on for map tasks, else false.
getReduceSpeculativeExecution
public boolean getReduceSpeculativeExecution()
Should speculative execution be used for this job for reduce tasks?
Defaults to true.
Returns: true if speculative execution should be used for reduce tasks in
this job, false otherwise.
setReduceSpeculativeExecution
public void setReduceSpeculativeExecution(boolean speculativeExecution)
Turn speculative execution on or off for this job for reduce tasks.
Parameters: speculativeExecution - true if speculative execution
should be turned on for reduce tasks, else false.
getNumMapTasks
public int getNumMapTasks()
Get the configured number of map tasks for this job.
Defaults to 1.
Returns: the number of map tasks for this job.
setNumMapTasks
public void setNumMapTasks(int n)
Set the number of map tasks for this job.
Note: This is only a hint to the framework. The actual
number of spawned map tasks depends on the number of InputSplits
generated by the job's InputFormat.getSplits(JobConf, int).
A custom InputFormat is typically used to accurately control
the number of map tasks for the job.
How many maps?
The number of maps is usually driven by the total size of the inputs,
i.e. the total number of blocks of the input files.
The right level of parallelism for maps seems to be around 10-100 maps
per node, although it has been set up to 300 or so for very cpu-light map
tasks. Task setup takes a while, so it is best if the maps take at least a
minute to execute.
The default behavior of file-based InputFormats is to split the
input into logical InputSplits based on the total size, in
bytes, of the input files. However, the FileSystem blocksize of the
input files is treated as an upper bound for input splits. A lower bound
on the split size can be set via
mapreduce.input.fileinputformat.split.minsize.
Thus, if you expect 10TB of input data and have a blocksize of 128MB,
you'll end up with 82,000 maps, unless setNumMapTasks(int) is
used to set it even higher.
Parameters: n - the number of map tasks for this job.
See Also: InputFormat.getSplits(JobConf, int)
getNumReduceTasks
public int getNumReduceTasks()
Get the configured number of reduce tasks for this job. Defaults to 1.
Returns: the number of reduce tasks for this job.
setNumReduceTasks
public void setNumReduceTasks(int n)
Set the requisite number of reduce tasks for this job.
How many reduces?
The right number of reduces seems to be 0.95 or
1.75 multiplied by (no. of nodes * no. of maximum reduce slots per node).
With 0.95 all of the reduces can launch immediately and
start transferring map outputs as the maps finish. With 1.75
the faster nodes will finish their first round of reduces and launch a
second wave of reduces, doing a much better job of load balancing.
Increasing the number of reduces increases the framework overhead, but
improves load balancing and lowers the cost of failures.
The scaling factors above are slightly less than whole numbers, to
reserve a few reduce slots in the framework for speculative tasks and failures.
Reducer NONE
It is legal to set the number of reduce tasks to zero.
In this case the outputs of the map tasks go directly to the distributed
file system, to the path set by
FileOutputFormat.setOutputPath(JobConf, Path). Also, the
framework doesn't sort the map-outputs before writing them out to HDFS.
Parameters: n - the number of reduce tasks for this job.
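As a worked example of the 0.95 guideline, with hypothetical cluster numbers:
// Sketch: size the reduce phase from the cluster's reduce capacity.
int nodes = 20;
int reduceSlotsPerNode = 7;
int numReduces = (int) (0.95 * nodes * reduceSlotsPerNode);  // 133
JobConf job = new JobConf();
job.setNumReduceTasks(numReduces);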
getMaxMapAttempts
public int getMaxMapAttempts()
Get the configured number of maximum attempts that will be made to run a
map task, as specified by the mapreduce.map.maxattempts
property. If this property is not already set, the default is 4 attempts.
Returns: the max number of attempts per map task.
setMaxMapAttempts
public void setMaxMapAttempts(int n)
Expert: Set the number of maximum attempts that will be made to run a
map task.
Parameters: n - the number of attempts per map task.
getMaxReduceAttempts
public int getMaxReduceAttempts()
Get the configured number of maximum attempts that will be made to run a
reduce task, as specified by the mapreduce.reduce.maxattempts
property. If this property is not already set, the default is 4 attempts.
Returns: the max number of attempts per reduce task.
setMaxReduceAttempts
public void setMaxReduceAttempts(int n)
Expert: Set the number of maximum attempts that will be made to run a
reduce task.
Parameters: n - the number of attempts per reduce task.
getJobName
public String getJobName()
Get the user-specified job name. This is only used to identify the
job to the user.
Returns:the job's name, defaulting to "".
setJobName
public void setJobName(String name)
Set the user-specified job name.
Parameters:name - the job's new name.
getSessionId
public String getSessionId()
Deprecated.&
Get the user-specified session identifier. The default is the empty string.
The session identifier is used to tag metric data that is reported to some
performance metrics system via the org.apache.hadoop.metrics API.
The session identifier is intended, in particular, for use by Hadoop-On-Demand
(HOD) which allocates a virtual Hadoop cluster dynamically and transiently.
HOD will set the session identifier by modifying the mapred-site.xml file
before starting the cluster.
When not running under HOD, this identifier is expected to remain set to
the empty string.
Returns:the session identifier, defaulting to "".
setSessionId
public void setSessionId(String sessionId)
Deprecated.&
Set the user-specified session identifier.
Parameters:sessionId - the new session id.
setMaxTaskFailuresPerTracker
public void setMaxTaskFailuresPerTracker(int noFailures)
Set the maximum no. of failures of a given job per tasktracker.
If the no. of task failures exceeds noFailures, the
tasktracker is blacklisted for this job.
Parameters:noFailures - maximum no. of failures of a given job per tasktracker.
getMaxTaskFailuresPerTracker
public int getMaxTaskFailuresPerTracker()
Expert: Get the maximum no. of failures of a given job per tasktracker.
If the no. of task failures exceeds this, the tasktracker is
blacklisted for this job.
Returns:the maximum no. of failures of a given job per tasktracker.
getMaxMapTaskFailuresPercent
public int getMaxMapTaskFailuresPercent()
Get the maximum percentage of map tasks that can fail without
the job being aborted.
Each map task is executed a minimum of getMaxMapAttempts()
attempts before being declared as failed.
Defaults to zero, i.e. any failed map task results in
the job being declared as failed.
Returns: the maximum percentage of map tasks that can fail without
the job being aborted.
setMaxMapTaskFailuresPercent
public void setMaxMapTaskFailuresPercent(int percent)
Expert: Set the maximum percentage of map tasks that can fail without the
job being aborted.
Each map task is executed a minimum of getMaxMapAttempts() attempts
before being declared as failed.
Parameters: percent - the maximum percentage of map tasks that can fail without
the job being aborted.
getMaxReduceTaskFailuresPercent
public int getMaxReduceTaskFailuresPercent()
Get the maximum percentage of reduce tasks that can fail without
the job being aborted.
Each reduce task is executed a minimum of getMaxReduceAttempts()
attempts before being declared as failed.
Defaults to zero, i.e. any failed reduce task results
in the job being declared as failed.
Returns: the maximum percentage of reduce tasks that can fail without
the job being aborted.
setMaxReduceTaskFailuresPercent
public void setMaxReduceTaskFailuresPercent(int percent)
Set the maximum percentage of reduce tasks that can fail without the job
being aborted.
Each reduce task is executed a minimum of getMaxReduceAttempts()
attempts before being declared as failed.
Parameters: percent - the maximum percentage of reduce tasks that can fail without
the job being aborted.
setJobPriority
public void setJobPriority(JobPriority prio)
Set the JobPriority for this job.
Parameters: prio - the JobPriority for this job.
getJobPriority
public JobPriority getJobPriority()
Get the JobPriority for this job.
Returns: the JobPriority for this job.
getProfileEnabled
public boolean getProfileEnabled()
Get whether the task profiling is enabled.
Returns: true if some tasks will be profiled
setProfileEnabled
public void setProfileEnabled(boolean newValue)
Set whether the system should collect profiler information for some of
the tasks in this job. The information is stored in the user log
directory.
Parameters: newValue - true means it should be gathered
getProfileParams
public String getProfileParams()
Get the profiler configuration arguments.
The default value for this property is
"-agentlib:hprof=cpu=samples,heap=sites,force=n,thread=y,verbose=n,file=%s"
Returns: the parameters to pass to the task child to configure profiling
setProfileParams
public void setProfileParams(String value)
Set the profiler configuration arguments. If the string contains a '%s' it
will be replaced with the name of the profiling output file when the task
runs.
This value is passed to the task child JVM on the command line.
Parameters: value - the configuration string
getProfileTaskRange
public org.apache.hadoop.conf.Configuration.IntegerRanges getProfileTaskRange(boolean isMap)
Get the range of maps or reduces to profile.
Parameters: isMap - is the task a map?
Returns: the task ranges
setProfileTaskRange
public void setProfileTaskRange(boolean isMap, String newValue)
Set the ranges of maps or reduces to profile. setProfileEnabled(true)
must also be called.
Parameters: newValue - a set of integer ranges of the map ids
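A minimal profiling setup might look like this (the task range is illustrative):
// Sketch: profile only the first three map tasks, using the default HPROF params.
JobConf job = new JobConf();
job.setProfileEnabled(true);
job.setProfileTaskRange(true, "0-2");   // isMap = true, map task ids 0..2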
setMapDebugScript
public void setMapDebugScript(String mDbgScript)
Set the debug script to be run when the map tasks fail.
The debug script can aid debugging of failed map tasks. The script is
given the task's stdout, stderr, syslog, and jobconf files as arguments.
The debug command, run on the node where the map failed, is:
$script $stdout $stderr $syslog $jobconf
The script file is distributed through the DistributedCache
APIs. The script needs to be symlinked.
Here is an example of how to submit a script:
job.setMapDebugScript("./myscript");
DistributedCache.createSymlink(job);
DistributedCache.addCacheFile("/debug/scripts/myscript#myscript");
Parameters: mDbgScript - the script name
getMapDebugScript
public String getMapDebugScript()
Get the map task's debug script.
Returns: the debug script for the mapred job for failed map tasks.
See Also: setMapDebugScript(String)
setReduceDebugScript
public void setReduceDebugScript(String rDbgScript)
Set the debug script to be run when the reduce tasks fail.
The debug script can aid debugging of failed reduce tasks. The script
is given the task's stdout, stderr, syslog, and jobconf files as arguments.
The debug command, run on the node where the reduce failed, is:
$script $stdout $stderr $syslog $jobconf
The script file is distributed through the DistributedCache
APIs. The script file needs to be symlinked.
Here is an example of how to submit a script:
job.setReduceDebugScript("./myscript");
DistributedCache.createSymlink(job);
DistributedCache.addCacheFile("/debug/scripts/myscript#myscript");
Parameters: rDbgScript - the script name
getReduceDebugScript
public String getReduceDebugScript()
Get the reduce task's debug script.
Returns: the debug script for the mapred job for failed reduce tasks.
See Also: setReduceDebugScript(String)
getJobEndNotificationURI
public String getJobEndNotificationURI()
Get the URI to be invoked in order to send a notification after the job
has completed (success/failure).
Returns: the job end notification URI, null if it hasn't been set.
See Also: setJobEndNotificationURI(String)
setJobEndNotificationURI
public void setJobEndNotificationURI(String uri)
Set the URI to be invoked in order to send a notification after the job
has completed (success/failure).
The URI can contain 2 special parameters: $jobId and
$jobStatus. Those, if present, are replaced by the job's
identifier and completion status respectively.
This is typically used by application writers to implement chaining of
Map-Reduce jobs in an asynchronous manner.
Parameters: uri - the job end notification URI
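For example (the callback URL is hypothetical; $jobId and $jobStatus are expanded by the framework):
// Sketch: HTTP callback fired once the job finishes.
JobConf job = new JobConf();
job.setJobEndNotificationURI(
    "http://example.com/notify?id=$jobId&status=$jobStatus");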
getJobLocalDir
public String getJobLocalDir()
Get job-specific shared directory for use as scratch space
When a job starts, a shared directory is created at location
${mapreduce.cluster.local.dir}/taskTracker/$user/jobcache/$jobid/work/ .
This directory is exposed to the users through
mapreduce.job.local.dir .
So, the tasks can use this space
as scratch space and share files among them.
This value is also available as a system property.
Returns:The localized job specific shared directory
getMemoryForMapTask
public long getMemoryForMapTask()
Get memory required to run a map task of the job, in MB.
If a value is specified in the configuration, it is returned.
Else, it returns MRJobConfig.DEFAULT_MAP_MEMORY_MB.
For backward compatibility, if the job configuration sets
mapred.task.maxvmem to a value different
from DISABLED_MEMORY_LIMIT, that value will be used
after converting it from bytes to MB.
Returns: memory required to run a map task of the job, in MB.
setMemoryForMapTask
public void setMemoryForMapTask(long mem)
getMemoryForReduceTask
public long getMemoryForReduceTask()
Get memory required to run a reduce task of the job, in MB.
If a value is specified in the configuration, it is returned.
Else, it returns MRJobConfig.DEFAULT_REDUCE_MEMORY_MB.
For backward compatibility, if the job configuration sets
mapred.task.maxvmem to a value different
from DISABLED_MEMORY_LIMIT, that value will be used
after converting it from bytes to MB.
Returns: memory required to run a reduce task of the job, in MB.
setMemoryForReduceTask
public void setMemoryForReduceTask(long mem)
getQueueName
public String getQueueName()
Return the name of the queue to which this job is submitted.
Defaults to 'default'.
Returns:name of the queue
setQueueName
public void setQueueName(String queueName)
Set the name of the queue to which this job should be submitted.
Parameters:queueName - Name of the queue
normalizeMemoryConfigValue
public static long normalizeMemoryConfigValue(long val)
Normalize the negative values in configuration
Parameters:val -
Returns:normalized value
findContainingJar
public static String findContainingJar(Class my_class)
Find a jar that contains a class of the same name, if any.
It will return a jar file, even if that is not the first thing
on the class path that has a class with the same name.
Parameters:my_class - the class to find.
Returns:a jar file that contains the class, or null.
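A usage sketch (MyJob is a hypothetical job class); this is essentially what setJarByClass(Class) does internally:
String jar = JobConf.findContainingJar(MyJob.class);
if (jar != null) {
    System.out.println("Job classes are packaged in " + jar);
}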
getMaxVirtualMemoryForTask
public long getMaxVirtualMemoryForTask()
Deprecated. Use getMemoryForMapTask() and getMemoryForReduceTask()
Get the memory required to run a task of this job, in bytes. See
MAPRED_TASK_MAXVMEM_PROPERTY.
This method is deprecated. Now, different memory limits can be
set for map and reduce tasks of a job, in MB.
For backward compatibility, if the job configuration sets the
key MAPRED_TASK_MAXVMEM_PROPERTY, that value is returned.
Otherwise, this method will return the larger of the values returned by
getMemoryForMapTask() and getMemoryForReduceTask()
after converting them into bytes.
Returns: Memory required to run a task of this job, in bytes.
See Also: setMaxVirtualMemoryForTask(long)
setMaxVirtualMemoryForTask
public void setMaxVirtualMemoryForTask(long vmem)
Deprecated. Use setMemoryForMapTask(long) and setMemoryForReduceTask(long)
Set the maximum amount of memory any task of this job can use. See
MAPRED_TASK_MAXVMEM_PROPERTY.
mapred.task.maxvmem is split into
mapreduce.map.memory.mb and mapreduce.reduce.memory.mb;
each of the new keys is set to mapred.task.maxvmem / 1024,
as the new values are in MB.
Parameters: vmem - Maximum amount of virtual memory in bytes any task of this job
can use.
See Also: getMaxVirtualMemoryForTask()
getMaxPhysicalMemoryForTask
public long getMaxPhysicalMemoryForTask()
Deprecated. This variable is deprecated and no longer in use.
setMaxPhysicalMemoryForTask
public void setMaxPhysicalMemoryForTask(long mem)
Deprecated.
Copyright © 2015 Apache Software Foundation. All rights reserved.
