Hadoop Mania: Performance Monitoring, Testing and Optimizing Hadoop-MapReduce Job using Hadoop Counters

If you run a Hadoop MapReduce Job and check its status page on job tracker for its job id, you will get performance counters like the below.

File System Counters

FILE: Number of bytes read=4386096368

FILE: Number of bytes written=8805370803

FILE: Number of read operations=0

FILE: Number of large read operations=0

FILE: Number of write operations=0

HDFS: Number of bytes read=54583718086

HDFS: Number of bytes written=4382090874

HDFS: Number of read operations=1479

HDFS: Number of large read operations=0

HDFS: Number of write operations=2

Job Counters

Launched map tasks=369

Launched reduce tasks=1

Data-local map tasks=369

Total time spent by all maps in occupied slots (ms)=34288552

Total time spent by all reduces in occupied slots (ms)=232084

Total time spent by all map tasks (ms)=8572138

Total time spent by all reduce tasks (ms)=58021

Total vcore-seconds taken by all map tasks=8572138

Total vcore-seconds taken by all reduce tasks=58021

Total megabyte-seconds taken by all map tasks=35111477248

Total megabyte-seconds taken by all reduce tasks=237654016

Map-Reduce Framework

Map input records=14753874

Map output records=666776

Map output bytes=4383426830

Map output materialized bytes=4386098552

Input split bytes=47970

Combine input records=0

Combine output records=0

Reduce input groups=1

Reduce shuffle bytes=4386098552

Reduce input records=666776

Reduce output records=666776

Spilled Records=1333552

Shuffled Maps =369

Failed Shuffles=0

Merged Map outputs=369

GC time elapsed (ms)=1121584

CPU time spent (ms)=23707900

Physical memory (bytes) snapshot=152915259392

Virtual memory (bytes) snapshot=2370755190784

Total committed heap usage (bytes)=126644912128

Shuffle Errors

BAD_ID=0

CONNECTION=0

IO_ERROR=0

WRONG_LENGTH=0

WRONG_MAP=0

WRONG_REDUCE=0

File Input Format Counters

Bytes Read=49449743227

File Output Format Counters

Bytes Written=4382090874

Counters are used to determine if and how often a particular event occurred during a job execution.

A task’s counters are sent in full every time, rather than sending the counts since the last transmission, since this guards against errors due to lost messages. Furthermore, during a job run, counters may go down if a task fails. Counter values are definitive only once a job has successfully completed.

Performance Monitoring: Tracking the counter values provides vital stats about the mapreduce job. In addition to stats on mapreduce job and its tasks, it also provide stats on operation done on underlying filesystems.

Performance Testing: Knowing the expected values of counters provide first line of performance validation for mapreduce jobs. It helps in ascertaining whether mapreduce job is behaving as expected by the ideal mapreduce paradigm. For example, the simple check that sum of number of records produced by mapper should be equal to total number of records consumed by producer.

Performance Optimization: This forms the second line of performance validation for mapreduce jobs. Counter values gave hints on what can be done to optimize that mapreduce job further.

To describe in detail how the above objectives of Performance Monitoring, Testing and Optimization can be achieved using Counters, I have listed in tabular format all mapreduce counters, their description, ideal expected values and how to optimize hints.

There are 2 categories of counters in Hadoop: In-built(file system, job, framework) and custom.

In-built Counters

Source Code hadoop-mapreduce-client-core/2.6.0/org/apache/hadoop/mapreduce

Counter	Brief Description	Detailed Description	Expected value for ideally optimized MapReduce Performance	How to Optimize in case of not expected value/ Bad Performance
Counter Group: FileSystem Counters (Source FileName in Hadoop Code = FileSystemCounter.properties) (CounterGroupName = File System Counters)
FOR each filesystem(HDFS, MAPRFS, S3):
· <FS>_BYTES_READ	Number of bytes read from filesystem, <FS>	The total number of bytes read from a specific filesystem.	No ideal expected value as it depends on volume of data being read.	It could be reduced by using efficient serialization techniques (like Avro, Parquet, etc) and compression algorithms (like LZO, Snappy, bzip2 etc). Choices of algorithms purely depend on use case. For example, Parquet is useful only if read pattern for data is limited to some columns. Also, Tradeoff of time taken to serialize-deserialize and compress-decompress should be taken into consideration.
· <FS>_BYTES_WRITTEN	Number of bytes written to filesystem, <FS>	The total number of bytes written into a specific filesystem.	No ideal expected value as it depends on volume of data being generated by mapreduce algorithm.
Example, FOR MAPRFS:
· MAPRFS_BYTES_READ	total number of bytes read from mapr file system.	Off the shelf counter for HDFS_BYTES_READ.
· MAPRFS_BYTES_WRITTEN	total number of bytes written to mapr file system.	Off the shelf counter for HDFS_BYTES_WRITTEN.
FILE_BYTES_READ	total number of bytes read from local file system	This counter is incremented for each byte read from the local file system. These writes occur: 1) during the shuffle phase, when the map phase intermediate data is read by HTTP worker threads(“tasktracker.http.threads”) into reducer tasks. 2) during the reduce phase, when reducers read grouped and sorted data from local filesystem.	No ideal expected value as it depends on shuffle phase and reducer logic.	It could be reduced by using efficient compression algorithms (like LZO, Snappy, bzip2 etc). Also, Tradeoff of time taken to compress-decompress should be taken into consideration
FILE_BYTES_WRITTEN	total number of bytes written to local file system	This counter is incremented for each byte written to the local file system. These writes occur: 1) during the map phase when the mappers write their intermediate results to the local file system. 2) during the shuffle phase when the reducers spill intermediate results to their local disks while sorting.	No ideal expected value as it depends on mapper logic and shuffle phase.	It could be reduced by using efficient compression algorithms (like LZO, Snappy, bzip2 etc). Also, Tradeoff of time taken to compress-decompress should be taken into consideration
READ_OPS	Number of read operations	It is accumulated at the client per file system. It is number of read operations such as listStatus, getFileBlockLocations, open etc.	This is useful in the interim to identify jobs that heavily load HDFS.
LARGE_READ_OPS	Number of large read operations	It is accumulated at the client per file system. On file system, most of the operations are small except listFiles for a large directory. Iterative listFiles was introduced in HDFS to break down a single large operation into smaller steps. This counter is incremented for every iteration of listFiles, when listing files under a large directory.	This is useful in the interim to identify jobs that heavily load HDFS.
WRITE_OPS	Number of write operations	It is accumulated at the client per file system. It is number of write operations such as create, append, setPermission etc.	This is useful in the interim to identify jobs that heavily load HDFS.

Counter Group: Job Counters (Source FileName in Hadoop Code = JobCounter.properties) (CounterGroupName= Job Counters)
SLOTS_MILLIS_MAPS	Total time spent by all maps in occupied slots (ms)	This value indicates wall clock time for the map tasks. It equals to FALLOW_SLOTS_MILLIS_MAPS + MILLIS_MAPS	Optimally, should be equal to MILLIS_MAPS	Reduce time spent in FALLOW_SLOTS_MILLIS_MAPS as much as possible
FALLOW_SLOTS_MILLIS_MAPS	Total time spent by all maps waiting after reserving slots (ms)	Indicates how much time map tasks wait in the queue after the slots are reserved but before the map tasks execute. Slot reservation is a capacity scheduler feature for memory-intensive jobs. Not used by YARN-based mapreduce.	Optimally, should be zero	A high number indicates a possible mismatch between the number of slots configured for a task tracker and how many resources are actually available. The way to reduce it can be one of the following: 1) Increase the number of nodes in cluster as it will distribute the data and computing capacity further. Overall, load on a particular node will decrease. 2) Reduce the mapper-task capacity of task-tracker nodes with low level of resources. Reduce the value of “mapred.tasktracker.map.tasks.maximum” on that task-tracker’s mapred-site.xml
MILLIS_MAPS	Total time spent by all map tasks (ms)	The total time taken in running all the map tasks in ms, including speculative tasks	No ideal optimal value as it depends on mapper logic.	Optimize the mapper code as much as possible by profiling it for memory leaks and following best coding standards.
SLOTS_MILLIS_REDUCES	Total time spent by all reduces in occupied slots (ms)	This value indicates wall clock time for the reduce tasks. It equals to FALLOW_SLOTS_MILLIS_REDUCES + MILLIS_REDUCES	Optimally, should be equal to MILLIS_REDUCES	Reduce time spent in FALLOW_SLOTS_MILLIS_REDUCES as much as possible
FALLOW_SLOTS_MILLIS_REDUCES	Total time spent by all reduces waiting after reserving slots (ms)	indicates how much time map tasks wait in the queue after the slots are reserved but before the reduce tasks execute. Slot reservation is a capacity scheduler feature for memory-intensive jobs. Not used by YARN-based mapreduce.	Optimally, should be zero	A high number indicates a possible mismatch between the number of slots configured for a task tracker and how many resources are actually available. The way to reduce it can be one of the following: 1) Increase the number of nodes in cluster as it will distribute the data and computing capacity further. Overall, load on a particular node will decrease. 2) Reduce the reducer-task capacity of task-tracker nodes with low level of resources. Reduce the value of “mapred.tasktracker.reduce.tasks.maximum” on that task-tracker’s mapred-site.xml
MILLIS_REDUCES	Total time spent by all reduce tasks (ms)	The total time taken in running all the reduce tasks in ms, including speculative tasks	No ideal optimal value as it depends on reducer logic.	Optimize the reducer code as much as possible by profiling it for memory leaks and following best coding standards.
VCORES_MILLIS_MAPS	Total vcore-seconds taken by all map tasks	Vcore stands for virtual core in CPU of a computer. This counter measures the cpu resources used by all the mappers. It is the aggregated number of vcores that each mapper had been allocated times the number of seconds that mapper had run. The number of vcores allocated to a mapper is controlled by property “mapreduce.map.cpu.vcores” (default 1) And in yarn, it is controlled by “yarn.nodemanager.resource.cpu-vcores” (default 8). In simple language, it is equal to (“mapreduce.map.cpu.vcores” X MILLIS_MAPS)	It gives the value of cpu resources that were blocked by mappers. The ideal optimal value is as much low value for this counter as possible.	To reduce value to optimize the value of this counter, either reduce the number of vcore allocated to each mapper or the time taken to execute a mapper instance. In other words, mapper should be made optimized and less resource-intensive as much possible. Then, reduce value of “mapreduce.map.cpu.vcores” that fits its requirement perfectly.
VCORES_MILLIS_REDUCES	Total vcore-seconds taken by all reduce tasks	Vcore stands for virtual core in CPU of a computer. This counter measures the cpu resources used by all the reducers. It is the aggregated number of vcores that each reducer had been allocated times the number of seconds that reducer had run. The number of vcores allocated to a reducer is controlled by property “mapreduce.reduce.cpu.vcores” (default 1) And in yarn, it is controlled by “yarn.nodemanager.resource.cpu-vcores” (default 8). In simple language, it is equal to (“mapreduce.reduce.cpu.vcores” X MILLIS_REDUCES)	It gives the value of cpu resources that were blocked by reducers. The ideal optimal value is as much low value for this counter as possible.	To reduce value to optimize the value of this counter, either reduce the number of vcore allocated to each reducer or the time taken to execute a mapper instance. In other words, mapper should be made optimized and less resource-intensive as much possible. Then, reduce value of “mapreduce.reduce.cpu.vcores” that fits its requirement perfectly.
MB_MILLIS_MAPS	Total megabyte-seconds taken by all map tasks	This counter measures the memory resources used by all the mappers. It is the aggregated amount of memory (in megabytes) that each mapper had been allocated times the number of seconds that mapper had run. The amount of memory(MB) allocated to a mapper is controlled by property “mapreduce.map.memory.mb” (default 1024) And in yarn, it is controlled by “yarn.nodemanager.resource.memory-mb” (default 8192). In simple language, it is equal to (“mapreduce.map.memory.mb” X MILLIS_MAPS)	It gives the value of memory resources that were blocked by mappers. The ideal optimal value is as much low value for this counter as possible.	To reduce value to optimize the value of this counter, either reduce the number of memory-mb allocated to each mapper or the time taken to execute a mapper instance. In other words, mapper should be made optimized and less resource-intensive as much possible. Then, reduce value of “mapreduce.map.memory.mb” that fits its requirement perfectly.
MB_MILLIS_REDUCES	Total megabyte-seconds taken by all reduce tasks	This counter measures the memory resources used by all the reducers. It is the aggregated amount of memory (in megabytes) that each reducer had been allocated times the number of seconds that reducer had run. The amount of memory(MB) allocated to a reducer is controlled by property “mapreduce.reduce.memory.mb” (default 1024) And in yarn, it is controlled by “yarn.nodemanager.resource.memory-mb” (default 8192). In simple language, it is equal to (“mapreduce.reduce.memory.mb” X MILLIS_ REDUCES)	It gives the value of memory resources that were blocked by reducers. The ideal optimal value is as much low value for this counter as possible.	To reduce value to optimize the value of this counter, either reduce the number of memory-mb allocated to each reducer or the time taken to execute a reducer instance. In other words, mapper should be made optimized and less resource-intensive as much possible. Then, reduce value of “mapreduce.reduce.memory.mb” that fits its requirement perfectly.
DATA_LOCAL_MAPS	Data-local map tasks	Total number of map tasks run on local data blocks (data locality).	It should be as much near to total maps as possible. Optimally, all the map tasks will execute on local data to exploit locality of reference	The only solution is to ensure maximum data locality. The way to ensure it can be one of the following: 1) Increase the number of nodes in cluster as it will distribute the data and computing capacity further. Overall, load on a particular node will decrease and probability of map task getting data locally will increase. 2) Increase the mapper-task capacity of task-tracker nodes with high level of resources. Increase the value of “mapred.tasktracker.reduce.tasks.maximum” on that task-tracker’s mapred-site.xml
RACK_LOCAL_MAPS	Rack-local map tasks	Total number of map tasks for which data blocks are on same rack as node on which map task is executing.	Ideally should be zero.
OTHER_LOCAL_MAPS	Other local map tasks tasks	Total number of map tasks for which data blocks are on some other rack as node on which map task is executing.	Ideally should be zero.
TOTAL_LAUNCHED_MAPS	Launched map tasks.	It defines how many map tasks were launched for the job, including failed tasks, killed tasks and tasks that were started speculatively.	Optimally, this number is the same as the number of splits for the job.	In case of value not maching number of input splits, verify the InputFormat used.
TOTAL_LAUNCHED_REDUCES	Launched reduce tasks	It defines how many reduce tasks were launched for the job, including failed tasks, killed tasks and tasks that were started speculatively.	Optimally, should be equal to the number of reducers configured for that mapreduce job (default is 2 reducers)
NUM_FAILED_MAPS	Failed map tasks	The number of map attempts/tasks that were failed. The reason of failure could be runtime exception in code or errors like Out Of Memory, I/O errors, etc.	Ideally should be zero.
NUM_FAILED_REDUCES	Failed reduce tasks	The number of reduce attempts/tasks that were failed. The reason of failure could be runtime exception in code or errors like Out Of Memory, I/O errors, etc.	Ideally should be zero.
NUM_KILLED_MAPS	Killed map tasks	The number of map attempts/tasks that were killed. The reason of killing could be speculative execution (in which slower task is killed) or failure of any node on which task is running.	Ideally should be zero.	A high number indicates too many tasks getting killed due to speculative execution or node failures. The way to reduce the instances of speculative execution can be one of the following: 1) Increase the number of nodes in cluster as it will distribute the data and computing capacity further. Overall, load on a particular node will decrease. 2) Remove or minimize the number of nodes with low level of resources from cluster.
NUM_KILLED_REDUCES	Killed reduce tasks	The number of reduce attempts/tasks that were killed. The reason of killing could be speculative execution (in which slower task is killed) or failure of any node on which task is running.	Ideally should be zero.

· Counter Group: Map-Reduce Framework Counters · (Source FileName in Hadoop Code = TaskCounter.properties) · (CounterGroupName= Map-Reduce Framework)
SPLIT_RAW_BYTES	Input split bytes	The total number of bytes of input-split object consumed read by maps. These objects represent the split metadata (offset and length within a file) rather than split data itself.	No ideal optimal value
MAP_INPUT_BYTES	Map input bytes	The number of bytes of uncompressed input read by all the maps in the job.	Must be equal to number of bytes in input data
MAP_INPUT_RECORDS	Map input records	The number of input records consumed by all the maps in the job. It is incremented for every successful record read from RecordReader and passed to the mapper’s map method. Records that the map tasks failed to read are not included in these counters.	Must be equal to number of records in input data	In case of non-matching value, check RecordReader logic
MAP_SKIPPED_RECORDS	Map skipped records.	The number of Input records skipped by all the maps in the job. Skipped records are bad/corrupt records that are making the map task attempts to fail due to run-time exceptions in third-party libraries being used in mapper. Skipping mode is turned on for a task only after it has failed twice. For a task consistently failing on a bad record, the tasktracker runs the following task attempts with these outcomes: 1. Task fails. 2. Task fails. 3. Skipping mode is enabled. Task fails but failed record is stored by the tasktracker. 4. Skipping mode is still enabled. Task succeeds by skipping the bad record that failed in the previous attempt. It can detect only one bad record per task attempt, so this mechanism is appropriate only for detecting occasional bad records. To give skipping mode enough attempts to detect and skip all the bad records in an input split, increase the value of “mapred.map.max.attempts”. Bad records are saved as sequence files in the job’s output directory under the _logs/skip subdirectory.	Ideally, should be 0
MAP_OUTPUT_RECORDS	Map output records	The number of map output records produced by all the maps in the job. It is incremented for every successful record written by the mappers. Records that the map tasks failed to write are not included in these counters.	No ideal optimal value as it depends on mapper logic.
MAP_OUTPUT_BYTES	Map output bytes	The number of bytes of uncompressed output produced by all the maps in the job. Incremented every time collect() method is called on a map’s Output Collector.	No ideal optimal value as it depends on mapper logic.	To optimize this counter, use Hadoop Serialization or Avro to serialize the Mapper output.
MAP_OUTPUT_MATERIALIZED_BYTES	Map output materialized bytes	The number of bytes of map output actually written to disk. This is visible and relevant only if map output compression is enabled. In that case, it is the number of bytes of compressed output produced by all the maps in the job.	No ideal optimal value as it depends on mapper logic.
COMBINE_INPUT_RECORDS	Combine input records	It indicates the number of records that were read by the optional combiner. Incremented every time a value is read from the combiner’s iterator over values	If you don’t specify a combiner, these counters should be 0. Otherwise, should be equal to MAP_OUTPUT_RECORDS	In case of unexpected value, check combiner logic
COMBINE_OUTPUT_RECORDS	Combine output records	It indicates the number of records that were written by the optional combiner. Incremented every time the collect() method is called on a combiner’s OutputCollector.	If you don’t specify a combiner, these counters should be 0. Otherwise, should be equal to REDUCE_INPUT_RECORDS	In case of unexpected value, check combiner logic
REDUCE_INPUT_RECORDS	Reduce input records	It indicates how many records were successfully read by the reducers. Incremented every time a value is read from the reducer’s iterator over values.	The input record counter should be equal to the MAP_OUTPUT_RECORDS counter, in case of no combiner. Otherwise, should be equal to REDUCE_INPUT_RECORDS
REDUCE_SKIPPED_RECORDS	Reduce skipped records	The number of Input records skipped by all the reducers in the job. Skipped records are bad/corrupt records that are making the reduce task attempts to fail due to run-time exceptions in third-party libraries being used in reducer. Skipping mode is turned on for a task only after it has failed twice. For a task consistently failing on a bad record, the tasktracker runs the following task attempts with these outcomes: 1. Task fails. 2. Task fails. 3. Skipping mode is enabled. Task fails but failed record is stored by the tasktracker. 4. Skipping mode is still enabled. Task succeeds by skipping the bad record that failed in the previous attempt. It can detect only one bad record per task attempt, so this mechanism is appropriate only for detecting occasional bad records. To give skipping mode enough attempts to detect and skip all the bad records in an input split, increase the value of “mapred.reduce.max.attempts”. Bad records are saved as sequence files in the job’s output directory under the _logs/skip subdirectory.	Ideally should be zero.
REDUCE_INPUT_GROUPS	Reduce input groups	The total number of distinct unique key groups consumed by all the reducers in the job. It is incremented for every unique key that the reducers process. Incremented every time the reducer’s reduce() method is called by the framework.	This value should be equal to the total number of different keys in the intermediate results from the mappers.
REDUCE_SKIPPED_GROUPS	Reduce skipped groups	The number of distinct key groups skipped by all the reducers in the job.	This value should be zero
REDUCE_OUTPUT_RECORDS	Reduce output records	It indicates how many records were successfully written by all the reducers in the job. Incremented every time the collect() method is called on a reducer’s OutputCollector.	Ideally, ratio of (REDUCE_OUTPUT_RECORDS/ MAP_INPUT_RECORDS) should be less than 1 and as much closer to 0 as possible.	Although even if ratio is > 1, it is not invalid. It only violates the basic assumption of mapreduce paradigm that reduce-phase aggregates the data considerably.
REDUCE_SHUFFLE_BYTES	Reduce shuffle bytes	It indicates how many bytes of the map output were copied by the shuffle to the reducers.	Should be as much lower as possible. Higher numbers here will make the job go slower as the shuffle process is the primary network consumer in the MapReduce job.	Compress and Serialize the Mapper Output. It could be reduced by using efficient serialization techniques (like Avro, Parquet, etc) and compression algorithms (like LZO, Snappy, bzip2 etc). Choices of algorithms purely depend on use case. For example, Parquet is useful only if read pattern for data is limited to some columns. Also, Tradeoff of time taken to serialize-deserialize and compress-decompress should be taken into consideration.
SPILLED_RECORDS	Spilled Records	It indicates how much data the map and reduce tasks wrote (spilled) to disk when processing the records.	Must be equal to sum of REDUCE_INPUT_RECORDS and MAP_OUTPUT_RECORDS. In case, combiner is also invoked, It must be equal to sum of REDUCE_INPUT_RECORDS and COMBINE_OUTPUT_RECORDS. Should be as much lower as possible as it involves costly Disk I/O.
SHUFFLED_MAPS	Shuffled Maps	The number of map output files transferred to reducers by shuffle. In shuffle phase, each mapper’s output is partitioned and sorted into files, one for each reducer. There might be scenario that keys are not uniformly distributed across mappers and some mappers don’t generate partitions for all reducers.	Must be less than or equal to product of number of successful mappers and successful reducers. Mathematically, must be less than or equal to (TOTAL_LAUNCHED_MAPS – NUM_KILLED_MAPS – NUM_FAILED_MAPS) * (TOTAL_LAUNCHED_REDUCES – NUM_KILLED_ REDUCES – NUM_FAILED_ REDUCES)
FAILED_SHUFFLE	Failed Shuffles	The number of map output copy failures during the shuffle.	Should be zero.
MERGED_MAP_OUTPUTS	Merged Map outputs	The number of map outputs that have been merged on the reduce side of the shuffle.	Must be equal to the value of SHUFFLED_MAPS
CPU_MILLISECONDS	CPU time spent (ms)	Total time spent by all tasks on CPU. It is gathered from /proc/cpuinfo and indicate how much total time was spent executing map and reduce tasks for a job.	Should be as much lower as possible.
GC_TIME_MILLIS	GC time elapsed (ms)	The total time spent in milliseconds doing garbage collection. The garbage collection counter is reported from GarbageCollectorMXBean.getCollectionTime(). The Garbage Collection happens in Java to collect unused objects (objects that are no longer referenced). If garbage collection is frequent and represents a lot of time, you may be allocating unnecessary objects.	It should be closer to 0 as much as possible	If GC_TIME is amounting to considerable proportion of task run time, follow steps can help 1) Add -verbose:gc -XX:+PrintGCDetails to “mapred.child.java.opts”. Then inspect the logs for some tasks (mappers or reducers). If new objects are being created (like “new Text” or “new IntWritable”) unnecessarily or inside a loop or inside a function, try to optimize it. The Hadoop Writable Objects are mutable. 2) If too much object creation is necessity, try to increase heap-size allocation to task attempt jvm by adding –Xmx parameter to “mapred.map.child.java.opts” for map tasks and “mapred.reduce.child.java.opts” for reduce tasks. For example, to set committed heap usage to 4GB for map tasks, set “–Xmx4096m” to “mapred.map.child.java.opts”.
VIRTUAL_MEMORY_BYTES	Virtual memory (bytes) snapshot	Total number of Virtual Memory Bytes (RAM+SWAP) consumed by all the tasks. It shows how much physical memory plus swap space is consumed by all tasks in a job.	(VIRTUAL_MEMORY_BYTES - PHYSICAL_MEMORY_BYTES ) indicates how much swap space or paged memory is used for a job, which should be close to 0 as much as possible	The only reason for high value of swap space is memory committed to current tasks on a node far exceeding available memory resource on that node. The reason could be other system processes or applications consuming memory. This will result in pagination of memory which is being indicated by high value of swap space. The way to ensure low value of swap memory can be one of the following: 1) Increase the number of nodes in cluster as it will distribute the data and computing capacity further. Overall, load on a particular node will decrease. 2) Decrease the mapper-task capacity and reducer-task capacity of task-tracker nodes having low level of resources. Decrease the value of “mapred.tasktracker.map.tasks.maximum” and “mapred.tasktracker.reduce.tasks.maximum” on that task-tracker’s mapred-site.xml
PHYSICAL_MEMORY_BYTES	Physical memory (bytes) snapshot	Total Physical Memory consumed by all the tasks. These statistics are gathered from /proc/meminfo and indicate how much RAM (not including swap space) was consumed by all the tasks.
COMMITTED_HEAP_BYTES	Total committed heap usage (bytes)	The total amount of memory available in the JVM in bytes, as reported by Runtime.getRuntime().totalMemory()	Should be close to total task requirement

References:

3 comments:

venkat8 August 2017 at 23:38
I am reading your post from the beginning, it was so interesting to read & I feel thanks to you for posting such a good blog, keep updates regularly.

Hadoop Online Training
R Programming Online Training|
Data Science Online Training|
Unknown5 May 2018 at 03:47
Hi,Your post on hadoop performance monitoring and testing was the best post and I understood the concepts very well and thanks for posting Hadoop Training in Velachery | Hadoop Training .
Veera Blogspot14 October 2020 at 22:26
Very nice article post,Thank you for sharing this awesome blog.
keep updating more big data hadoop tutorials.

Big Data Online Training

Hadoop Mania

Wednesday, 21 October 2015

Performance Monitoring, Testing and Optimizing Hadoop-MapReduce Job using Hadoop Counters

In-built Counters

Source Code hadoop-mapreduce-client-core/2.6.0/org/apache/hadoop/mapreduce

References:

3 comments:

About Me