Hadoop 3.1 is a major version in the Hadoop 3.x line and was released with many new features. This article discusses the features of Apache Hadoop 3.1.
Hadoop 3.1 is a major release with many significant changes and improvements over the previous release, Hadoop 3.0. In this article we discuss the features of the Apache Hadoop 3.1 Big Data platform.
Hadoop 3.1.0 comes with new features, bug fixes, improvements and many other changes.
List of Major features of Hadoop 3.1
Hadoop 3.1.0 is another major release of the Big Data platform, aimed at providing a high-performance compute system that meets the needs of today's machine learning workloads. It supports training and deploying models built with deep learning frameworks. Data scientists can run jobs in parallel to achieve high performance in near real time.
This is a major feature: Hadoop 3.1.0 can run long-running jobs natively. YARN provides an API for long-running services and hosts them natively, which means developers can run long-running jobs directly on YARN.
In this version YARN gains advanced features: it supports container orchestration and manages containerized services. YARN supports both Docker containers and process-based containers.
YARN provides first-class support for GPU scheduling and isolation, for both Docker and non-Docker containers. Currently, only Nvidia GPUs are supported on YARN.
To enable GPU support, declare the GPU resource type in resource-types.xml.
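As a minimal sketch, assuming the resource name `yarn.io/gpu` used in the Hadoop "Using GPU on YARN" documentation, the GPU resource type can be declared in resource-types.xml roughly as follows. Check the property names against the documentation for your Hadoop version before relying on them.

```xml
<!-- resource-types.xml: sketch only, verify against your Hadoop version -->
<configuration>
  <property>
    <!-- Comma-separated list of additional resource types known to YARN -->
    <name>yarn.resource-types</name>
    <!-- Assumed here: the GPU resource name from the Hadoop documentation -->
    <value>yarn.io/gpu</value>
  </property>
</configuration>
```

The same documentation also asks each NodeManager to enable the GPU plugin by setting `yarn.nodemanager.resource-plugins` to `yarn.io/gpu` in yarn-site.xml, and the Capacity Scheduler to use the `DominantResourceCalculator`, because the default calculator takes only memory into account.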
GPU support is a very exciting feature: machine learning programs can now use the GPUs in a Hadoop cluster to speed up processing.
In Hadoop 3.1.0, YARN also comes with first-class support for FPGA scheduling and isolation. This is available for both Docker and non-Docker containers on top of the YARN resource management and job scheduling stack.
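FPGA support is enabled through configuration as well. The sketch below assumes the resource name `yarn.io/fpga` from the Hadoop "Using FPGA on YARN" documentation and shows only the NodeManager-side setting in yarn-site.xml; `yarn.resource-types` in resource-types.xml would need to list the same name. Treat it as an illustration rather than a complete setup, since vendor plugin settings and device discovery are left out.

```xml
<!-- yarn-site.xml on each NodeManager: sketch only, vendor-specific settings omitted -->
<configuration>
  <property>
    <!-- Resource plugins the NodeManager should load -->
    <name>yarn.nodemanager.resource-plugins</name>
    <!-- Assumed here: the FPGA resource name from the Hadoop documentation -->
    <value>yarn.io/fpga</value>
  </property>
</configuration>
```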
Hadoop 3.1.0 lets administrators specify queue resources in absolute terms, such as X memory, Y vcores and Z GPUs. This gives them finer control over how processes are managed in the Hadoop cluster. Earlier, resources could only be specified as percentage values.
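To illustrate, a Capacity Scheduler queue can be given absolute values in place of a percentage. This is a sketch under assumptions: the queue name `analytics` is hypothetical and the numbers are arbitrary, while the bracketed `[memory=...,vcores=...]` syntax follows the Capacity Scheduler documentation.

```xml
<!-- capacity-scheduler.xml: sketch only; "analytics" is a hypothetical queue -->
<configuration>
  <property>
    <!-- Guaranteed (minimum) capacity as absolute resources; memory is in MB -->
    <name>yarn.scheduler.capacity.root.analytics.capacity</name>
    <value>[memory=10240,vcores=12]</value>
  </property>
  <property>
    <!-- Upper limit for the same queue, also given as absolute resources -->
    <name>yarn.scheduler.capacity.root.analytics.maximum-capacity</name>
    <value>[memory=20480,vcores=24]</value>
  </property>
</configuration>
```

Before this release, the same `capacity` property would have held a percentage of the parent queue, for example `40`.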
A new storage type, PROVIDED, is added in Hadoop 3.1.0. It allows an administrator to configure a mapped drive, that is, storage that lives outside the cluster. Once mapped, this storage is managed by HDFS.
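A rough sketch of how PROVIDED storage might be switched on in hdfs-site.xml is shown below. The property names follow the HDFS Provided Storage documentation as best understood here; the `remoteFS://...` URI and the local path are placeholders, and a working setup also needs an alias map that is not shown.

```xml
<!-- hdfs-site.xml: sketch only; paths and URIs are placeholders -->
<configuration>
  <property>
    <!-- Lets the NameNode handle PROVIDED storage -->
    <name>dfs.namenode.provided.enabled</name>
    <value>true</value>
  </property>
  <property>
    <!-- DataNode volumes: one local DISK volume plus one PROVIDED volume -->
    <name>dfs.datanode.data.dir</name>
    <value>[DISK]/local/path/to/blocks/,[PROVIDED]remoteFS://remoteFS-authority/path/to/data/</value>
  </property>
</configuration>
```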
Here is the list of new features added in the Hadoop version 3.1.0:
The following are the major fixes that come with Hadoop 3.1.0:
You can view full details at https://s.apache.org/apache-hadoop-3.1.0-all-tickets.