<返回更多

两种 java 向 yarn 提交 spark 任务命令的区别

2022-07-28    Java熬夜党
加入收藏

核心代码

private void pi() {         log.info("----- start pi -----");        final String JAVAHome = System.getenv("JAVA_HOME");        final String hadoopConfDir = System.getenv("HADOOP_CONF_DIR");        log.info("javaHome: " + javaHome);        log.info("hadoopConfDir: " + hadoopConfDir);        log.info("sparkHome: " + sparkHome);        log.info("mode: " + deployMode);        log.info("AppResource: " + sparkJar);        log.info("mainClass: " + mainClass);        final String[] args = new String[]{ "--jar", sparkJar, "--class", mainClass, "--arg", "10"};        String appName = "spark-yarn";        System.setProperty("SPARK_YARN_MODE", "true");        SparkConf sparkConf = new SparkConf();        sparkConf.setSparkHome(sparkHome);        sparkConf.setMaster("yarn");        sparkConf.setAppName(appName);        sparkConf.set("spark.submit.deployMode", "cluster");        String jarDir = "hdfs://sh01:9000/user/deployer/spark-jars/*.jar";        log.info("jarDir: " + jarDir);        sparkConf.set("spark.yarn.jars", jarDir);        if (enableKerberos) {             log.info("---------------- enable kerberos ------------------");            sparkConf.set("spark.hadoop.hadoop.security.authentication", "kerberos");            sparkConf.set("spark.hadoop.hadoop.security.authorization", "true");            sparkConf.set("spark.hadoop.dfs.namenode.kerberos.principal", "hdfs/_HOST@KPP.COM");            sparkConf.set("spark.hadoop.yarn.resourcemanager.principal", "yarn/_HOST@KPP.COM");        }        ClientArguments clientArguments = new ClientArguments(args);        Client client = new Client(clientArguments, sparkConf);//        client.run();        ApplicationId applicationId = client.submitApplication();        log.info("submit task [{}] and application id [{}] ", appName, applicationId.getId());        YarnAppReport yarnAppReport = client.monitorApplication(applicationId, false, true, 1000);        log.info("task [{}] process result [{}]", appName, yarnAppReport.finalState());        if (yarnAppReport.finalState().equals(FinalApplicationStatus.SUCCEEDED)) {             log.info("spark任务执行成功");        } else {             log.info("spark任务执行失败");        }        log.info("----- finish pi -----");    }

两种提交方式有什么区别

client.run() 是同步的,spark 任务结束前该行一下的代码不会执行。该方法的无返回值,也就是说拿不到 spark 任务执行的任何信息。

client.submitApplication() 是异步的,提交任务后立即执行该行下的代码。但是该方法会返回 ApplicationId ,这个就很有用啦。接下来可以调用 monitorApplication 方法让 java 代码 block 住,并且拿到 spark 任务执行的一些信息。

YarnAppReport yarnAppReport = client.monitorApplication(applicationId, false, true, 1000);
public YarnAppReport monitorApplication(final ApplicationId appId, final boolean returnOnRunning, final boolean logApplicationReport, final long interval) { // 代码就不贴了,有需要自己去看喽。}
声明:本站部分内容来自互联网,如有版权侵犯或其他问题请与我们联系,我们将立即删除或处理。
▍相关推荐
更多资讯 >>>